Welcome to the seventy-seventh episode of Access On, the National Federation of the Blind's Technology podcast.
Episode
Listen to the seventy-seventh episode of the Access On podcast (Browser).
Or listen on your preferred podcast platform.
Timestamps
In this week's episode of Access On:
- Microsoft Ability Summit 0:00
- Recap of Google IO 16:40
- Apple previews some new accessibility features 34:48
- Tips for spotting deep fakes 43:47
- Possible Brava smart oven alternatives 47:32
- Three utilities from Jamal Mazrui 51:39
- Note that some of the links provided below are direct links to the installers. Microsoft Edge may warn that the files are not commonly downloaded.
- Turning a print score into ABC notation 56:00
- Closing and contact info 59:10
Transcript
Speaker 1:
Live the life you want.
Jonathan Mosen:
Welcome to Access On, the technology podcast of the National Federation of the Blind. This week we bring you news from and around Microsoft's 16th annual Ability Summit. Google's I/O Conference showcases AI developments and new audio glasses coming soon. Apple gives us a sneak peek into accessibility features coming to its new operating systems. And in an era of increasingly capable AI, how can you detect a deep fake?
It's Jonathan Mosen at the Jernigan Institute in Baltimore, Maryland, welcoming you to episode 77 of the podcast. It's been a very busy time for tech news, so we have plenty to tell you about today. So let's start with the Microsoft Ability Summit. It began in 2010 as an internal gathering of about 20 people and it's grown into a global event that drew more than 20,000 attendees, most of those virtual for the keynote, from 164 countries to the 2025 edition. I haven't seen data on this year's yet.
This year's summit ran on May 19th and 20th with the mainstage keynote on the morning of the 19th Pacific Time hosted at the Microsoft Conference Center in Redmond and streamed worldwide. It's a pretty impressive facility this, and I can tell you that firsthand because I had the privilege of speaking during the keynote representing the National Federation of the Blind. It's a significant moment for Microsoft this year because it's the first Ability Summit under Microsoft's new chief accessibility officer, Neil Barnett, who succeeded Jenny Lay-Flurrie in April. Neil has been at Microsoft for 24 years now and he spent the last 12 advancing accessibility. He built and scaled efforts including Microsoft's neurodiversity program and the Disability Answer Desk, which many of us use. And Microsoft tells us that the Disability Answer Desk or they call it DAD internally, which is kind of cute, has supported more than two million customers since 2013.
Now, Jenny hasn't gone anywhere. She has moved into a wider role. She's now vice president leading Microsoft's trusted technology group. When Neil was appointed in April, Microsoft issued a press release and he said in that release, and I'm quoting, "The choices we make now will determine who technology truly works for and accessibility must be part of that foundation, not added later, not assumed," end quote. That tone of building accessibility in from the start rather than bolting it on, ran through everything Microsoft announced and it's something that the National Federation of the Blind has continued to say for a long time. If you're going to do accessibility right and most efficiently, that's done when representatives of the organized blind movement are around the table at the product conception stage, not handed something that's complete and told, "What do you think of this?" Because inevitably, if the product isn't as good as it should be or could be, we are going to respond and it just delays the process.
So that concept of co-design is something we are passionate about. Now, when I was up there as part of the keynote for the Microsoft Ability Summit, we announced a partnership between Microsoft and the National Federation of the Blind on a suite of Copilot training modules. We've been arguing that as artificial intelligence becomes embedded into every productivity tool a blind worker, student, or homemaker uses, training has to keep pace. And it also has to be what I would call culturally appropriate. There is a lot of training out there for Copilot and other tools, but it's not the kind of material that you can just take and put on a device like a Victor Reader Stream or an iPhone, hear the computer as the work is done and follow along. And that's the kind of training that many blind people prefer and learn best from.
It's not enough for Copilot to be technically accessible with a screen reader. Blind users need confident, structured instruction in how to prompt Copilot well, how to verify its output, how to recognize hallucinations, and how to integrate Copilot into the existing screen reader workflows in Word, Outlook, Excel, PowerPoint, and Teams. The new training modules, which Microsoft and the Federation announced jointly at the summit, will be free, hosted on Microsoft Learn and developed with input from blind subject matter experts at the Federation's Center of Excellence in Nonvisual Accessibility. These modules will sit alongside Microsoft's existing accessibility skilling program, which has now reached more than five million learners worldwide. And there's another announcement from Microsoft that I think many people will be absolutely delighted with. We certainly are.
Since 2024, Microsoft has all but required laptop manufacturers to ship a dedicated Copilot key on new Windows PCs. On most of those keyboards, the Copilot key took the place of either the right-hand Ctrl key or the Context key, also sometimes known as the Applications key. For sighted users who never used those keys, the change passed without too much complaint, but for screen reader users, it broke some established workflows in serious ways. Over the last year or so, we have strengthened our advocacy efforts with Microsoft on this topic, urging them to make this Copilot key configurable so that there will be a simple way that you could go in and revert the Copilot key to its previous function or map it to whatever function that you want.
Microsoft has now confirmed in its own support documentation that in its words, customers who rely on the right Ctrl key or context menu key for keyboard shortcuts or assistive technologies such as screen readers experienced challenges to their workflows when using these devices. A Windows 11 update arriving later this year will add a setting under settings, then Bluetooth and devices, then keyboard that lets users remap the Copilot key to behave as either the context menu key or the right Ctrl key. The remap option is expected to ship as part of the Windows 11 2026 update, which Microsoft Insiders are already testing with broader rollout expected in October or November. Enterprise and education editions will be able to manage the setting through mobile device management policies. That means that schools and workplaces can deploy the remap on behalf of the user who needs it.
It's nice to know that these discussions and the letters have borne fruit and we appreciate Microsoft listening and acting on this important issue. Microsoft seems to be going all in on Narrator at the moment, the screen reader built into Windows. We've covered some of the changes that have been deployed to Narrator in past episodes of Access On. It is becoming increasingly viable to use, particularly if you confine yourself to the Microsoft ecosystem. I think it is fair to say that if you go off the beaten path a bit, if you want to use a program where some sort of configuration scripting, whatever is necessary, Narrator is still not there.
But if you're the kind of user who primarily uses Microsoft 365 applications and stays in a web browser, you may find Narrator is enough. One of the most significant announcements to come from Ability Summit is the rollout of H-I-D or HID Braille support. This is the whole concept of being able to plug and play with a Braille display, just like you were plugging in a USB keyboard or a printer or some other peripheral, no need for proprietary drivers. And we're finally starting to see the promise of HID become a reality. Many of us have in the past been burned by trying Narrator's Braille support only to find it was very difficult to have it let go of the Braille display so we could use another screen reader with our Braille display again. This should be a thing of the past as the HID Braille support starts to roll out, so this is a welcome development.
For many users who just need Braille to work, this was one of the remaining reasons why they tended still to avoid Narrator so I think more people will try it. One of the other ones is the lack of Eloquence in the product. Of course, text-to-speech is a very subjective thing and not everybody likes Eloquence, but a lot of people do like Eloquence. And you can see this when you look at the lengths that people go to try and get Eloquence working in NVDA. You also see the celebrations that occurred when Apple introduced Eloquence. And I think it is fair to say if Apple can do it, then why can't Microsoft do it? Eloquence in Narrator is probably one of the remaining big ones that might cause people to give Narrator a bit of a miss. And I think if Microsoft introduces Eloquence to Narrator, then this will be very welcome by a lot of people. So let's hope that they can get this done.
Another area of interest is the support for Narrator that has been rolled out to the Word editor. This is the thing that you get when you press F7 and it checks your spelling and grammar. I don't do that very often because I don't need no grammar checker. In the blog post that announced this, Microsoft said that reviewing and fixing spelling and grammar issues will now be faster, clearer, and more efficient. They acknowledge something that blind professionals have been saying for a long time now. As they put it, those of us who use screen readers must take explicit steps to identify issues that sighted users catch implicitly through red squiggles. And when moving through errors, Narrator was previously announcing too much information at once, repeating labels, reading spelling at full speed, and demanding extra keystrokes. The new Narrator experience changes this in several concrete ways.
When you land on an error, Narrator now announces information in a deliberate order. First, the type of issue, whether it's a spelling error or a grammar error. Second, the problematic word or phrase. Third, the surrounding sentence context. And fourth, the available corrections. For spelling errors specifically, Narrator now slows down temporarily when spelling out the incorrect word so that even if you normally listen at a fast speech rate, you can clearly hear which letters are wrong. They've also tightened the keyboard model, which is a great experience. You can press the number one to activate the first suggestion, two for the second and so on. You can press the letter I to ignore the error once, G to ignore all, and A to add the word to the dictionary. So it's kind of back to the future really because it's sounding more like the old spell checker experience that many of us used for years and liked and got on well with.
I don't know at this stage the degree to which all of these changes filter through to other screen readers. They were focusing on Narrator, which of course they're now pushing pretty hard. If you tab and use the arrow keys to navigate, that's been refined to reduce the number of keystrokes needed to move through the pane. You can try this if you have Word version 2601 Build 19725.20126 or newer and you have to be running Windows 11, 24H2, 25H2, or 26H1. So update your Word, see what it's like and we can ascertain whether these things are rolling out to other screen readers or whether this is a Narrator-exclusive feature. Microsoft also took the opportunity to reinforce and expand several Narrator features that began shipping in late 2025 and continue to roll out. Now these include rich image descriptions which you get by pressing the Narrator key plus Ctrl plus D for description.
And they now run not only on Snapdragon-powered Copilot+ PCs, but also on AMD and Intel Copilot+ machines. If you press the Narrator key plus Ctrl plus S, you'll be asking Copilot to describe the full screen with the image only shared after you choose to describe it, which keeps you in control of what leaves your device. And Microsoft, of course, is pretty proud of these new high definition voices for United States English, which are powered by Azure's latest on-device text-to-speech models. What I've found in my own use is that it's sometimes nice to be read to if you're not in a hurry and you want a human sounding voice and of course people's text-to-speech preference varies, as I said before. So some people will really like this. Other people just want to crank up the Eloquence, so it's important to meet both use cases.
These voices are pretty interesting because they use generative AI to adjust tone and pacing and Microsoft says they sound clearer and more natural when they're reading longer documents. What I find really fascinating is that you can read the same document, say the same paragraph to try this out several times and the inflection changes. So they are real AI voices, if you will, and they will read differently every time you read the same passage. Now some real innovation here that I want to point out on Copilot+ PCs, you can now customize how Narrator announces different control types by typing a plain language instruction. For example, if you don't want Narrator to announce position information or whether text is selected or not, you can type that instruction in a natural language way into an edit box, preview the change, and then ask Narrator to apply it.
It's not irrevocable if you change your mind, a single command resets Narrator to its default behavior. Microsoft is continuing to make some consolidations to the Copilot's interface. So if you have a newer version of Microsoft Word, you may already be seeing this. I did have someone contact us quite concerned that Alt + I in their Microsoft Word went away and if you use Copilot regularly, you will know that Alt + I is the one that you can just press and give it an instruction to draft you something and then it goes away and does it. And then there was Alt + H + FX to get the deeper Copilot experience in Word. So if you open Word one day and you are used to using Alt + I and you press that key and it goes ding, don't panic. It just means that you have got the new Copilot experience rolled out to you.
And what you do is there's just one key to press now for all things Copilot and that is Alt + C, a good mnemonic, Alt + C for Copilot. You may, if you have a Copilot license, want to take a look at some of the agentic things that are rolling out to Microsoft Word at the moment because Copilot has become capable of doing some very nice editing that goes well beyond the somewhat blunt instrument of the global search and replace that we've all been using for years. For example, I had a document given to me which started life as a plain text file and there were lots of sections in this document and each section was separated by three asterisks so that you could push Ctrl + F for find, type in three asterisks and navigate to the next section. So I loaded this document into Word with the agentic Copilot humming away and I said to Copilot, every time you see three asterisks, I want you to generate a heading level two.
And for the text of that heading level two, look at what that section is about from one set of three asterisks to the next and I want you to do a very, very concise summary suitable for a heading. And it did it and it made the document much more navigable because then I could just use quick keys in my screen reader, press H to navigate by heading and hear what each section was about. That is real time saving stuff that AI is delivering. So those are some of the interesting things from Microsoft's Ability Summit. I was delighted to be there and to network with a lot of people who are in Microsoft and of course in the wider industry.
Let's move on to Google I/O 2026. There is so much here. It was a tidal wave of AI announcements and of course we're going to have a look at some of those through an accessibility lens. Google has announced a feature called Gmail Live and this lets you have a spoken conversation with your inbox. Now when we do our Apple recaps, we talk sometimes about Sherlocking, which is an expression that gets used in Apple land when Apple comes along and essentially pulls the rug from under a third-party app. And I think it's fair to say that with this announcement, even though Sherlocking, as far as I know, tends not to be used in a Google context, Google has Sherlocked some of these third party apps that are doing this very thing. In the demo, Google product leaders asked questions of Gmail such as who was the plumber that gave me a quote for the bathroom renovation last year and what events does my son Jack have at school this week? Gmail Live, which is powered by Gemini pulled the answer out of buried messages.
It spoke those answers and then of course it was possible for the questioner to ask follow-up questions. What was pretty impressive about this was that the feature did understand the difference between a trip and a field trip and it handled hotel room numbers and flight details. You could also shift topics within the same conversation. Google says they're rolling out similar voice features to Google Keep, which will let you sort through ideas and have them automatically sorted into separate notes. And it's also coming to Google Docs through a feature that they're calling Docs Live. So through this, you can build a document by talking to it and it's a process so you can keep talking to it until you get the tone and the structure that you're after. Because this is steeped in the Google ecosystem, you are able to reference documents that Google can see.
For example, if they're in Drive or if you've got email in Gmail all through your voice. Now these Gmail Live, Docs Live, and Keep Voice features are not free. They're going to roll out this summer in the Northern Hemisphere to subscribers of Google AI Pro and Google AI Ultra in English, starting in the United States with Google Workspace business customers getting access in preview.
Let's move on to wearables, specifically glasses and even more specifically what Google is calling audio glasses. You may recall that last year we talked about Google's XR platform and how the open infrastructure of XR may be quite disruptive and beneficial for the blind community. Now we're starting to see how this is shaping up. Google have partnered with Samsung and they've unveiled the first Android XR intelligent eyewear products. These have been designed in conjunction with the fashion brands Gentle Monster and Warby Parker.
Now, if you're going to go shopping for these when you eventually can and you can't right now, it's important to understand that there are two broad product categories. The first is audio glasses and they're going to deliver spoken help in your ear through small built-in speakers without any visual display. So clearly they are targeting the Meta glasses here full on. The second category is the display glasses, which also project information into the lenses. So there you've got a competitor to the Meta display glasses. Those do have a screen reader on them and there are blind people using the Meta display glasses. It's not clear to me at the moment whether the Google equivalent is going to have a screen reader or not. But I think for most of us, the audio glasses are by far the most interesting category. The pitch from Google was straightforward. You wear what looked like ordinary glasses. You either say, "Hey, Google," or you tap the side of the frame, Gemini wakes up and answers your questions about what's around you.
Now, what's very interesting from a blindness perspective is that these glasses are going to give you turn by turn walking directions while knowing which direction you're facing. So it's going to see your environment and give you the typical Google Maps experience, but it's going to augment that environment with visual information that it's picking up from the camera. It'll also read menus and signs aloud. It will translate in real time. It will, of course, take photos and it's going to connect to a range of third-party apps that will include Uber and the language learning app Mondly. Google's pretty consistent about making its technology work for those who use iOS as well as Android. And in this case, they're doing this once again.
They have announced that the glasses will work with iOS as well at launch. Manufacturers are expected to launch these in the Northern Hemisphere fall in select markets. How much does it cost, we're all asking? We don't know yet. They didn't announce those details, but we would expect them to be competitive on price with the Meta products. Now we come to what is by far the most controversial part of I/O: the changes to Google Search. Now, Google has described this redesign as the largest single change to its search box in roughly 25 years. I don't think that's an exaggeration at all. Some people have even gone as far as saying Google Search isn't Google Search anymore. That may well be true. The new search includes an intelligent search box and it expands as you type to handle longer, more conversational queries. It includes AI powered suggestions that try to anticipate intent.
It includes AI mode, which is the dedicated chat style search experience, which is now powered by Gemini 3.5 Flash and which Google's head of search, Elizabeth Reid said already serves more than one billion users per month and is doubling in query volume every quarter. The new search also includes background information agents that monitor topics for you 24/7. This is in a way, I think, the next evolution of Google alerts. If you've not used this, it is actually still around and it allows you to search for specific things. You might use it to get an alert about the company that you work for or some particular artist or subject matter that you're interested in. And once you've set up an alert, you can subscribe via an RSS feed or you can have it email you. And I actually still do this. I use a Google Alert for blindness related topics.
I have fine tuned this Google alert over time and it now does send me really good quality blindness related articles, but this looks like the next iteration of that. So you can set the agent to track, for example, the price of a flight or the housing market in a neighborhood or a sports team's standing. I'm thinking about how the Orioles have been going up and down, and the agent will alert you when something changes. There's also something called generative UI. We've heard of generative AI, this is generative UI, with UI standing for user interface. And what happens here is that the search builds a custom layout on the fly for complex questions, including dynamic tables, interactive visuals and dashboards. We'll be keeping track of the accessibility of this to make sure that we're not being excluded. There's even a new shopping experience, which is called Universal Cart, that's built on Google Wallet and that tracks prices and offers across the Gemini app, YouTube, and Gmail.
I said when I introduced this that it's the most controversial part of I/O by far. And let me try and delve into why that's the case. Publishers, journalists, small website owners and disability advocates, in fact, have all been sounding the alarm about what AI search means for the web that produces the content these AI systems summarize. The point is that traditional Google Search used to drive traffic to the websites of these organizations and that was important. And there's been a bit of an encroachment over time. This isn't necessarily just an AI thing. Google's had those top results for some time that has meant that some people haven't bothered to click through to a website. Now we're at the position where zero click searches in which the user gets an answer without ever visiting a third-party site now account for roughly 60% of Google queries.
Now for news related searches, that figure actually rose to 69% in the year after AI Overviews launched. National Public Radio, which of course is facing all sorts of challenges at the moment, has called the trend an extinction-level event for online news publishers. Now you may be inclined to say, "Well, all industries get disrupted at one time or another and these guys will just have to adapt." And perhaps to some degree that is true, but it's important to point out that the AI answers that we are going to increasingly read are only as accurate as the model behind them. So if you are not going to the actual source of the information but you're having something summarize that information to you, there is a risk you're not getting the full picture. There's also a risk that it's just going to make it up and get it completely wrong.
So if Google is going to make it harder to actually click through to verify the source when verification is important, that is a concern. Google has already rolled out some mitigations including inline links, hover previews on desktop, which hopefully will be accessible, a subscriber label for news subscriptions, article suggestions at the end of AI answers and community perspectives that quote Reddit forum posts. The United Kingdom's Competition and Markets Authority has proposed regulations that would require Google to give publishers effective opt-out controls. So there's another danger there that if Google's going all in on this pathway and the options exist for publishers to opt out, you may not be getting the full picture anymore. So this is in quite a state of flux right now. There's an enormous amount of social media and tech chatter about this and we'll see how it shakes down over the next few months.
In the meantime, there are, of course, alternative search engines such as DuckDuckGo. You might want to try that by going to duck.com. And another very impressive search engine that I've never had to speak out loud before. So I actually did have to ask AI how I'm supposed to pronounce it and it tells me I pronounce it kaji. It's spelled K-A-G-I.com. This one is very impressive. You do have to pay for this one to get the full experience, but it's not a huge amount and they keep it ad free as a result because it's user funded. On Gemini itself, Google announced Gemini 3.5 Flash and previewed Gemini 3.5 Pro. Both models target faster, more agentic tasks. This is a new word we're hearing all over the place in AI. Agentic simply means there are agents involved in this. In Google's own wording from the I/O 2026 announcement, when looking at output tokens per second, Flash is four times faster than other frontier models. I'm sure people will be beating up on that to check the specs.
The company also unveiled Gemini Omni. This might be one that some blind creators wish to play with because you can give it any kind of source and it will turn that into video from the input. As someone who is embracing this technology but also has a kind of a trust but verify mentality about it, initially, at least if I were to use this, I would generate some video content and then verify it with something like PiccyBot or even ask a trusted human, imagine that, to make sure that what I'm going to send out into the world is what I think I am sending out into the world. There's also Gemini Spark. That's a personal AI agent that in Google's words takes actions on your behalf to help you navigate your digital life.
Spark integrates with Gmail, Docs, and other Workspace apps and it'll expand to third-party tools through the Model Context Protocol. It's rolling out to Google AI Ultra subscribers in the United States only. Google has also redesigned the Gemini app with a more visual interface, which is called Neural Expressive, and it adds animations, timelines, and inline images alongside the text response, and Gemini Live now launches in an inline rather than a full screen experience. I've been on the road and I'm recording this as soon as I got back off the road, so I haven't had too much time to play with Gemini, but I haven't seen any regressions myself. If you have noticed any changes with this new experience, please do let us know how you're getting on. Subscribers received both good and difficult news on pricing and limits at I/O and this is probably the second most controversial thing beyond all the changes to Google Search.
Google has introduced a 100 dollar a month tier for Google AI Ultra and they've reduced the previous Ultra plan from 250 to 200 dollars a month. However, where it gets controversial is what they've done at the lower end of the spectrum. It's basically a lot more user pays. They've switched from a daily prompt limit to a compute used model. Now they calculate your usage based on how complex your prompts are, what features you use, and how long your chat is. Limits refresh every five hours until you hit a weekly cap. Reaction's been mixed and in some quarters very frustrated about all of this. Users on Reddit and elsewhere have reported burning through significant percentages of their five-hour windows with single prompts on Gemini 3.5 Flash. This is such a competitive space that everybody listens pretty closely to an outburst of feedback and as a result, Google has already tripled the rate limits in its Antigravity coding tool twice in response to user complaints.
This is not new. It's not unheard of with a new technology. A company rolls out something exciting. They want to put it in your hands. They want it to become a part of your workflow and then they start to realize that they don't have an economically viable model so they start charging a lot more to try and make it cost effective for them. Let's see what happens over the next few weeks. This is a really fluid development.
Mac users will be pleased to celebrate the arrival of a Gemini app for macOS. There's a new image creation and editing app, which is called Google Pics, which is based on the Nano Banana model family. There's expanded conversational search in YouTube. This is called Ask YouTube. There's a redesigned Maps experience and continued investment in agentic shopping protocols. Android Halo, which is a new status bar element, will show the progress of Gemini Spark and other agents at the top of your phone screen so you can see what your agent is doing without you having to leave the app that you're in.
Google is continuing to work on TalkBack and this includes expanded image descriptions, expanded Braille display support where you can now set your own key maps for the display, so that's very welcome, and improvements to the Magnifier app on Pixel devices. We're going to take a break and when we come back, we'll have a look at recent accessibility announcements from Apple as we prepare for WWDC 2026 as Access On continues.
Make a difference with the National Federation of the Blind's Keep NFB Advancing Give 25 campaign. Each year, thousands of federation members and friends contribute to support blind people, but we still need your help to fund our programs in 2026 and beyond. When you give 25 dollars between May 15th and July 7th, you're entered into the Keep NFB Advancing drawing. Each 25-dollar increment is a chance to win. Your support helps us continue to build a network that advances the lives of all blind people across America. You could win prizes like round trip transportation for two to the 2027 NFB National Convention, hotel accommodations, registration, banquet ticket, or 2,000 dollars cash.
Speaker 3:
And you can double your dollars up to 25,000 dollars, thanks to a gift from Aira, the visual interpreting service.
Jonathan Mosen:
Want a chance to win the BrailleNote Evolve from HumanWare? Become a federation connector. Ask friends and family to contribute before national convention and indicate that you prompted their giving. The Give 25 drive supports the Kenneth Jernigan Fund, SUN Fund, tenBroek Memorial Fund, and the White Cane Fund. You can choose a fund when you donate. To enter, visit nfb.org/give25donate. That's nfb.org/give25donate. Call 410-659-9314, extension 2430, or send a check to National Federation of the Blind and mention #Give25 and the fund in the memo. The winner will be announced July 8th, 2026. Thank you for your generosity.
This is Access On from the National Federation of the Blind. Global Accessibility Awareness Day fell on Thursday, May the 21st this year. It's always the third Thursday in May, and in a tradition that's now in its fifth straight year, Apple posted its preview of new accessibility features on the Tuesday before, May the 19th. The announcement covers features expected to ship in the 27 range of operating systems from Apple and we'll learn a lot more about those operating systems at the Worldwide Developers Conference on June 8th. Compared with the 2025 announcement, you could argue that things are a little bit thin this year. 2025 delivered Braille Access, Magnifier for the Mac, the accessibility nutrition labels on the App Store, live captions on Apple Watch, and a major expansion of Braille device support. So that was pretty substantial stuff. There are no new major Braille features in this announcement and the list of bullet points is shorter. There's no doubt about that.
And the 2026 list is consistent with the general strategic direction we are expecting from these operating systems, which is that Apple is finally trying to deliver on Apple Intelligence. There's a perception that Apple is behind in the AI race and of course they got badly burnt promising things relating to Siri that they could not deliver. So this year we are likely to see a deepening of Apple Intelligence right across the operating system and these accessibility announcements seem consistent with that vision. Tech journalists who seem to have good inside information at Apple, particularly Bloomberg's Mark Gurman, who's a very reliable source usually, seem to be indicating that this is one of those years where Apple's going to focus obviously on Apple Intelligence, but otherwise hold back on too many major new features so they can focus on fixing bugs. And there may be many people who say hallelujah with respect to accessibility because there have been bugs that continue to creep in.
It's interesting to talk to Mac users who've gone all in on the Mac operating system, who love their Macs, and the Apple Silicon is beautiful hardware, but they feel like VoiceOver is neglected on the Mac. And certainly the recent AppleVis scorecard seems to indicate that is a widely held view. Great hardware, shame about the screen reader. So if things could be tidied up a lot with this release, then many blind people would welcome that development. And it's also fair to say that the list that Apple publishes ahead of Global Accessibility Awareness Day is not exhaustive. So we may, when we start getting the betas of iOS 27 and all of its siblings, see that there are features here that weren't mentioned that are actually quite substantial. So we'll reserve judgment on that. So let's have a look at some of the features Apple says we are getting and many of these relate to Apple Intelligence. That means that you need an Apple Intelligence phone to take advantage of these. VoiceOver and Magnifier are getting deeper Apple Intelligence integration. Image Explorer in VoiceOver will use Apple Intelligence to provide more detailed descriptions of images system-wide.
Apple gives examples of photographs, scanned bills, personal records, and other visual content. They're updating live recognition so VoiceOver users can press the Action button on the iPhone to ask a question about what's in the camera's viewfinder and receive a detailed response. Are they trying to ... I can use Sherlock again. Are they trying to Sherlock Be My Eyes and Aira Access AI and all those things? Well, I guess we'll wait and see. I would observe though that there are so many things that I would love to assign to the Action button. I do hope that Apple in iOS 27 will give us the ability to create a menu on our Action button, or if not that, give us the ability to have one action when you tap the button and another one when you hold it. That would be wonderful. You can then ask follow-up questions, by the way, once you get into this mode and it's using onboard Apple Intelligence.
And the significance of this is that we may find that responses could be much, much faster than some of the other services which go out to the cloud. You've got to send your picture to the cloud, it's got to think, and then it's got to send the answer back to you. So having all that happen on device, I think we might be in for a very swift and potentially delightful experience. They're going to improve the Accessibility Reader to handle more complex source material like scientific articles with multiple columns, images, and tables. It now provides on-demand summaries to give an overview of an article before diving into the details and it includes built-in translation that preserves your custom font, formatting and colors. Voice Control is one of those features that I use a lot. And the reason why I use it is that sometimes I'm getting ready for work in the morning or I just don't have my phone in my hands and I want to continue to read news articles or engage and I find Voice Control great for that.
But undoubtedly, the primary audience for Voice Control is those who have difficulty with using the touchscreen and it's getting some attention this year. With Apple Intelligence, you'll be able to use natural language to describe the interface. Apple's framing is "say what you see," with examples like "tap the orange folder" or "zoom in on the red text." Hopefully it will be mindful of the actual text on the screen as well. Voice Control will be available in English in the United States, Canada, the United Kingdom, and Australia. For deaf and hard of hearing users, Apple is introducing generated subtitles. This is an on-device feature that automatically generates closed captions for videos that don't already include them. This works on iPhone, iPad, Mac, Apple TV, and Apple Vision Pro, and it works with clips recorded on iPhone, on videos received from friends and on streamed online content. Hopefully this will also work in Braille. We'll wait and see for that. Generated subtitles will launch in English in the United States and Canada.
Of course, we are expecting to see finally the new Apple Siri, which is going to be powered by Google's Gemini with some typical Apple privacy safeguards in place there, and we would hope to see a significant improvement in dictation. I think that some of these apps that are around at the moment that introduce themselves as a third-party keyboard and give you very good quality dictation that allows you to make mistakes and stumble and fumble and have the AI understand what you're trying to say will be Sherlocked as well. I get to use the term again. And it may be that with all of this Apple Intelligence evolution, we see dictation improving considerably, and I hope that will also mean that we see an improvement in the quality of the transcriptions we get from the live captions mode in Braille Access.
As someone with a hearing impairment who struggles in some noisy environments or echoey environments, I cannot tell you what a significant game changer ... I try to use that term sparingly, but what a complete game changer this is for me. It just takes the stress away from a lot of situations, but sometimes the dictation errors that it makes are perplexing and sometimes very amusing and very occasionally even a bit inappropriate. So hopefully those changes will roll out to that Braille Access feature and that will be very welcome for deafblind people. And as we head towards WWDC, which will be on June the 8th, I should say that we will amend the usual publication schedule for Access On and we will publish an episode with the usual crew recapping the Apple event. We will record immediately upon the conclusion of the keynote and publish as soon as we finish recording. So that will be on June the 8th.
So a staggering amount of really interesting technology news in this episode, and I hope that that recap has been helpful to try and make sense of some of it. If you have anything to add, any comments, any concerns, how you're feeling about it, do be in touch and share your views with us, [email protected] is the email address. That's accesson all joined together at nfb.org. You can contribute in two ways. You can attach an audio clip using the voice memos app or recording on your computer or whatever works for you, or you can just write it down.
We have some time to hear from listeners like you and given all the talk about AI in this episode, it seems appropriate to hear from Catherine Samuel who says, "Hi, Jonathan. I'm embarking on a new side gig, pursuing becoming an accredited financial counselor." Well, good luck with that, Catherine. She says, "In my studies, I've done a deep dive into phishing and identity theft prevention, and this led me down a rabbit hole of researching how AI has allowed scammers to massively up their game and become harder to spot. In particular, I have been very disheartened to learn how good AI is becoming at generating deep fake videos, podcast interviews, or even video conferencing conversations, simulating real people saying and doing things they never actually said or did. Almost all of the advice I found on how to spot deep fakes involves visual clues such as the lip movements not matching the spoken words quite right, eye or hand movement not looking natural, or some tiny discontinuity in an image.
These are often subtle enough that fully sighted people miss them. For those of us who are blind or low vision, relying on visual tells is a complete no-go and videos are everywhere on every social media platform. Even if you're not a big consumer of videos, you may listen to podcasts or find yourself on a Zoom or Teams call with someone you don't know well. So how do we know what's real and what isn't? I did some outreach to quite a few blind colleagues, many of whom work in the tech sector to get their thoughts on how we can spot deep fakes non-visually. I thought I'd share this list that I compiled with your listeners. This isn't validated in scientific study or anything. It's just anecdotal, but I found it helpful and hopefully other listeners will too.
First, video is very short, generally under a minute or two. AI is such a resource hog that long-form videos are still a stretch for it. Next, speaker's voice has the same cadence, rhythm, or vocal intonation throughout. The next item on the list, voices or sound effects have odd timing, pauses where there should not be, or no pause where there should naturally be one. And the next one, voices sound flat or with limited emotion despite having some vocal inflection. And the next one, speech seems too perfect, ums and ahs and even breathing sounds are inserted but don't sound natural. The next one, laughter sounds fake or bland. Background noise is added to make the video sound real, but it lacks authenticity or doesn't match the video's content quite right. And foreign words or names with unusual spellings are mispronounced." Catherine says, "I would be curious if you or your listeners have any non-visual deep fake tells to add to this list.
Thanks so much." That is a brilliant subject, Catherine. Thank you so much for bringing this up. I would certainly be interested in hearing others' experiences of this and whether they have things to add. And have you actually fallen victim to a deep fake? Please let us know your experiences and share any tips that you think might help to determine when you may be dealing with a deep fake. That's [email protected] if you want to be in touch.
Let's go back to Brava. I can report that our Brava is still working at home, but it has stopped pushing notifications at this point. So they've taken the push notification server offline, but the app does still work for loading recipes into it. We are somewhat in trepidation about how long that will continue to work. It was an expensive thing to buy and every so often it requires an update and it stops working until we can get some assistance from Aira to update the device.
And when that happens, we sort of wonder, is this the moment where we lose access to Brava? But so far it has continued to work. That's me knocking on the wood there. But Justin Phillips has been in touch. He says, "I don't know how accessible it is, but you might want to try Posha or Posha, P-O-S-H-A.com. It has both an Android and an iOS version. After the disappointment of Brava, this might be an alternative. Do give it a try and let us know. Thanks always for your excellent show. Thank you." That's Justin from India. We may well check that out, Justin. I can tell you that Travis Rea, who used to work for Brava and he provided such exceptional support to those of us who bought Brava devices. And obviously, when the company was shut down, he was out of a job like all the other great Brava people there and it was all very unfortunate.
He has resurfaced. He's doing some accessibility consulting at a company called Tovala. They have quite a different concept, so we'll probably get Travis on the show at some point to describe this, but it is another smart oven. It is considerably cheaper than Brava and it's kind of like the old days of cell phone plans. So these guys at Tovala have packaged meals, like a lot of these services here in the United States, CookUnity and various other places, where you subscribe to a meal plan. Bonnie and I did this for a while when we were settling in. So we had 10 meals that we would get delivered. We settled on CookUnity and we'd get them delivered so that we didn't have to worry about cooking in the weekday evenings, and then the weekends we would cook, and the selection changes every week and it wasn't too bad really.
So Tovala is like that. They're in that space. They have various meals to suit different diet preferences. But where it's different is that they have a smart oven that goes with it. And if you commit, say, to a year of subscription, then they subsidize the price of their hardware and it can actually be really cheap to get one of these smart ovens. We are getting one at the Jernigan Institute to evaluate, so we'll have more to say about this in due course, but I do know that the oven itself does not have a touchscreen. It has physical buttons. It has an app that is at least accessible on iOS from what I can gather. What happens is that when you get the meals, there's a barcode and you can scan the barcode either with the scanner that's on the oven itself or you can scan the QR code on your phone and then the oven cooks the meal. It gets the instructions about how to cook the meal, what settings to set.
You just leave it in there and you have a meal at the other end after you've scanned the barcode. So it does sound very interesting. If you want to check it out for yourself, it's spelled T-O-V-A-L-A, tovala.com. It is only available in the United States and their oven does other things. So you can use it for cooking. I think it has an air frying function and various other functions. And one area where it is superior to Brava is that you don't need sighted assistance to get up and running. It connects to the wifi without any kind of assistance. So there's another option in the market. I appreciate that some people might be feeling a little bit burned, which seems a very appropriate thing to say in the context of an oven after the Brava experience, but we'll see what people make of this as they try it out. If you already have tried it, let us know how you're getting on with it.
Let's talk about vibe coding. I'm very excited to see the democratization of software development and already we're starting to see this in our community. There are blind people who have already developed, using vibe coding, some wonderful tools that I'm using myself and many people in the community are using. If you're not familiar with the concept of vibe coding, this is essentially the ability to tell an AI about the software that you want to create. If you don't know any code but you have a really clear picture in your head of the software that you want to develop, then you can now develop it. It's highly advisable to have qualified review of the code and you've obviously got to test it rigorously, but I think there's all sorts of positivity around self-determination, taking control of our own destiny by developing software that works for us.
I think that software development is on the cusp of a very different future because I see the potential for software to become much more personal. I've vibe coded some things that I have no intention of sharing with anybody else. They just work for me. They meet a need that I have. And I think what will happen a lot more in future is that individuals with very specific preferences who have a really clear idea on what it is that they want will develop personalized software that is perfect for them. It's great to see that the barriers to software development are coming crashing down and this stuff is getting better all the time. I say this as a precursor to Jamal Mazrui's email and Jamal says, "I have made major improvements to three free tools recently designed and built with assistance from Claude AI. They are intended to benefit end users, accessibility specialists and software engineers."
I will endeavor to put links to these applications in the show notes if you're interested in getting to them, so do check the show notes. "We have 2HTM, extCheck and URL Check. They are a small family of free MIT licensed Windows command line tools developed by me and shared on GitHub with full C# and Python source code. Each is distributed as a single file 64-bit independent binary executable that can run without an installation step, without a runtime dependency, and without anything in the registry. 2HTM, and that is the number 2 and then HTM, converts Word, Excel and PowerPoint formats to clean accessible HTML using Microsoft's APIs with options for plain text and image stripping. Then there is extCheck, which checks the accessibility of .docx, .xlsx, .pptx, and .md files, writing CSV reports of issues found by an extensible rule set based on guidelines from the Microsoft Office Accessibility Checker and the Web Content Accessibility Guidelines.
And then we have URL Check, which drives the Microsoft Edge browser through the Playwright API and runs the axe-core testing code on each page, producing per page reports plus a draft accessibility conformance report, which is called ACR.xls and ACR.docx, covering all 86 WCAG 2.2 success criteria and offering a spreadsheet with manual tests for all criteria to complement the automated results. The three programs share a consistent interface and a set of friendly features intended to make them equally convenient for the typical Windows user working through a GUI dialogue and for developers automating tasks working through the command line. Because the same options are available either way, the same scan or conversion can be reproduced from a script or scheduled task just as it was performed by hand." Thank you, Jamal. I really appreciate that. Congratulations.
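For readers who want a feel for what a per page report built on axe-core might involve, here is a minimal Python sketch. It is not Jamal's code and does not reproduce URL Check's actual output format. It assumes only the general shape of an axe-core results object, where a "violations" list holds entries with "id", "impact", "help" and "nodes" fields, and each node has a "target" list of selectors. The function name violations_to_csv, the column names and the sample data are all hypothetical.

```python
import csv
import io


def violations_to_csv(page_url, axe_results):
    """Flatten an axe-core style results dict into CSV text.

    Hypothetical helper for illustration: one row per affected element.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["page", "rule_id", "impact", "help", "target"])
    for violation in axe_results.get("violations", []):
        for node in violation.get("nodes", []):
            # A node's target is a list of selectors; join them for one cell.
            target = " ".join(str(t) for t in node.get("target", []))
            writer.writerow([
                page_url,
                violation.get("id", ""),
                violation.get("impact") or "",
                violation.get("help", ""),
                target,
            ])
    return out.getvalue()


# Made-up sample data in the assumed axe-core shape, not real scan output.
SAMPLE_RESULTS = {
    "violations": [
        {
            "id": "image-alt",
            "impact": "critical",
            "help": "Images must have alternative text",
            "nodes": [{"target": ["img.logo"]}, {"target": ["#hero > img"]}],
        }
    ]
}

if __name__ == "__main__":
    print(violations_to_csv("https://example.com/", SAMPLE_RESULTS))
```

Writing one row per affected element, rather than one row per rule, keeps the report easy to sort and filter in a spreadsheet, which fits the CSV-based reporting described in the email.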
And let's conclude with a message from Peter, who is in Budapest in Hungary. He says, "Hi, Jonathan. Do you or someone in your audience know how to turn sheet music on paper into ABC notation code or MIDI, or maybe into MusicXML? What I found is a program called Audiveris. As far as I understand, it's a kind of OCR for sheet music. It is free. I'm not particularly expert in reading music. I only bought a clarinet last October and started to learn how to play it. For now, I'm not planning to have sheet music in Braille. I also bought a textbook which is full of sheet music. It is made for absolute beginners, so the pieces are quite simple ones. If I had them in ABC notation, I could listen to the music and also read the notes in the form of regular letters if needed.
Is this Audiveris accessible for screen reader users or should I choose another method? I think I wouldn't want to buy some expensive software. My Yamaha YCL255S clarinet cost a lot. I love it. I'm not complaining, but somewhere you have to draw the financial limits. So I'm interested mainly in free or cheap solutions."
Let me give you a few thoughts and others may wish to expand on them. Audiveris is a great and powerful tool, but my understanding is that when it comes to correcting any errors that you find, that's quite difficult to do with a screen reader. What I would suggest actually is that you just do a search because you may find that a lot of the basic things that are in that textbook are already available online. You could go to abcnotation.com or MuseScore, which is spelled M-U-S-E-S-C-O-R-E, and search for the names of the simple tunes in the textbook and see how many you can find there. They're likely already to have been digitized on those sites.
You could use AI for simple transcription. For the unique textbook exercises, you could take a clear photo and pass it to an AI description tool like Be My Eyes or Aira's Access AI, or you could use one of the mainstream solutions and give an explicit prompt like "please transcribe this simple sheet music into standard ABC notation text." And once you have what you're looking for, you could paste the ABC notation into an accessible web-based player like the online tools at abcnotation.com that will allow you to hear the MIDI playback, so that will be useful. There is a tool called Goodfeel, which is developed by Dancing Dots, that may be more than you're looking for, but if you're looking for a blindness specific solution, do check out Goodfeel. Dancing Dots do some fantastic work and they are specific to the blind community. Others may well have suggestions for you. We look forward to getting them and happy clarinet tooting.
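To make "notes in the form of regular letters" concrete, here is a small Python sketch of how plain ABC text relates to MIDI note numbers. It is an illustration under stated assumptions, not a full ABC parser: it handles only natural notes with octave marks and ignores accidentals, key signatures, note lengths and rests. The tune is a made-up C major scale, and the function names abc_note_to_midi and tune_to_midi are hypothetical. In ABC, an uppercase C is middle C, a lowercase c is one octave higher, a comma lowers a note by an octave and an apostrophe raises it by one.

```python
import re

# Semitone offsets from C for the seven natural note letters.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# A made-up example tune: header fields first, then one line of notes.
TUNE = """X:1
T:Example scale
M:4/4
L:1/4
K:C
C D E F | G A B c |"""


def abc_note_to_midi(token):
    """Convert one plain ABC note such as C, c, C, or c' to a MIDI number."""
    letter = token[0]
    if letter.upper() not in SEMITONES:
        raise ValueError("not an ABC note letter: " + letter)
    # Uppercase starts at middle C (MIDI 60); lowercase is an octave higher.
    base = 60 if letter.isupper() else 72
    midi = base + SEMITONES[letter.upper()]
    for mark in token[1:]:
        if mark == ",":
            midi -= 12  # each comma lowers the note one octave
        elif mark == "'":
            midi += 12  # each apostrophe raises the note one octave
        else:
            raise ValueError("unsupported mark: " + mark)
    return midi


def tune_to_midi(abc_text):
    """Return MIDI numbers for the notes in the tune body, skipping headers."""
    notes = []
    for line in abc_text.splitlines():
        # Header fields look like "K:C", a single letter followed by a colon.
        if re.match(r"^[A-Za-z]:", line):
            continue
        for token in re.findall(r"[A-Ga-g][,']*", line):
            notes.append(abc_note_to_midi(token))
    return notes


if __name__ == "__main__":
    print(tune_to_midi(TUNE))  # prints [60, 62, 64, 65, 67, 69, 71, 72]
```

Because the whole tune is ordinary text, it can be read with a screen reader or Braille display and pasted straight into an ABC player to hear it.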
That concludes this episode of Access On, the technology podcast of the National Federation of the Blind. To send in a contribution for a future episode, email us, attach an audio clip or just write it down and send it to [email protected]. That's [email protected]. To keep up to date with Access On, follow us on Mastodon, [email protected]. That's [email protected] on Mastodon. To subscribe to an announcement only email list about upcoming episodes, send a blank message to [email protected]. That's [email protected]. To learn more about the National Federation of the Blind, visit our website nfb.org or phone us 410-659-9314. That's 410-659-9314. And be sure to check out the Nation's Blind Podcast right from where you heard this podcast.