Intro to Cross Screen Reader Testing
Hey everyone.
Thanks so much for watching this.
I'm really excited to be here.
Today, we're going to be going beyond screen reader 101 and diving into cross screen reader testing, or trying out your website with all the different options and types of screen readers out there.
Maybe you've already spent some time getting familiar with VoiceOver on your Mac, or you've installed NVDA on your PC.
My goal today is to get you comfortable with trying everything else.
To give you a brief overview, I'm going to first cover some of the reasons for testing with multiple screen readers before taking a detailed look into some of their bigger differences.
Finally, we're going to jump right in and try to build some of your screen reader testing skills through real examples.
As a quick note, I should be watching along with you in the chat; if not, I'm probably asleep.
So please ask me anything along the way.
I'm guessing it's not too hard to convince you why testing with multiple screen readers is important.
It's the same reason we test in multiple browsers.
We want our websites to work for everyone.
There's another reason that I've heard that I think often causes confusion, and that's that testing with screen readers is a requirement for WCAG compliance.
Technically that's not true.
You can fully test WCAG compliance without ever opening a screen reader.
But as Adrian Roselli, whose talk on overlays I'm excited to see, tweeted earlier this year, accessibility testing goes beyond WCAG compliance.
To me, cross-screen reader testing is all about trying to find and eliminate as many potential barriers as possible.
I want my code to be compatible with a variety of screen readers and other assistive technologies.
So it can be used by as many people as possible.
And on a scale of one to 10, how likely is it that your site has cross-screen reader bugs?
You know, what are the chances that you spend all this time watching this talk and learning how to cross screen reader test, only to find that you don't actually have any issues?
Well, I'd say it's at least as likely as you having cross-browser bugs, maybe a little bit more.
And my intuition there comes from understanding what complex pieces of software screen readers are, as well as how much less familiar most developers are with the differences between screen readers, compared to browsers.
Most of us have been slowly accumulating cross-browser knowledge throughout our careers, but now we're just learning about screen readers for the first time.
And that's a lot to catch up with.
And when I say screen readers are complex software, I mean like web browser complex.
I'm sure you've stumbled across a W3C HTML or CSS standards discussion on GitHub before.
There's nothing quite like an issue about a hundred replies deep with every reply containing a tiny little essay to help you appreciate just how many edge cases browsers have to deal with.
This is a screenshot from NVDA's GitHub to illustrate my point.
It's one of over 2,500 open issues right now.
And we're 73 comments deep into a discussion that started in 2013, just to try to improve the pronunciation of NVDA's announcements based on the Unicode characters in the text.
It's complex stuff.
And remember, screen readers rely on browsers to implement accessibility correctly as well.
So a browser bug can easily be a screen reader bug.
If you're curious about how that API stack works, check out Adem's talk, "Accessibility APIs: Where the Magic Happens."
Oh, and did I mention screen readers also have to support all the native apps on the operating system, plus the operating system's UI itself?
All that's to say screen readers have a hard job and there's a lot that can go wrong.
So they're bound to have differences and quirks.
It's worth learning about those so we can try to handle them.
So without further ado, let me introduce to you the cast of today's talk: NVDA, VoiceOver, JAWS, and TalkBack, some of the most popular screen readers in use today.
And a special shout out to Narrator over there in the bottom right.
It's the built-in screen reader for Windows.
And it isn't always a crowd favorite, but it comes in handy, especially if you need to install NVDA.
And not pictured is Orca, which is Linux's screen reader, and ChromeVox, which is built into Chrome OS.
NVDA has been with us since 2006 and as of 2019 claimed over a hundred thousand users. It's also the only open source screen reader on this slide.
And you can support them by donating on their website.
Apple's VoiceOver, which first came out for Mac OS and later iOS, has been around since 2005.
And being the only option on those operating systems, it has quite a bit of usage, especially on iOS.
JAWS, which stands for Job Access With Speech, came out in 1995.
And finally, Google's TalkBack for Android. I'm not exactly sure when it was released, at least as far back as 2012. If you know, please type it in the chat. But it's quickly maturing.
It's now at version nine.
Just like browsers, screen readers have a rich history.
Some went the way of Netscape; others persevere, but have very low usage.
While there's a long history of browsers sharing their user agent string freely, whether or not someone is using a screen reader and which one they use is pretty sensitive information.
You could use that as a proxy for whether or not they have a disability, and that's just not something that should be exposed with every HTTP request.
Luckily, there's still a few options that can help us prioritize while still preserving privacy.
First, start with your users.
Have you gotten any bug reports from screen reader users?
Or maybe you've done some user research.
Second, research your target market.
Are they more likely to use desktops or mobile devices?
What operating systems are they likely to use?
And third, WebAIM, a globally recognized web accessibility non-profit conducts an annual screen reader survey.
This year it received over 1500 anonymous, valid responses.
The entire WebAIM survey is an interesting read, but the responses to "What is your primary desktop or laptop screen reader?" are probably the most often cited.
This year JAWS eclipsed NVDA with 53.7% of responses.
Last year, NVDA actually had a slim lead.
And Mac OS VoiceOver comes in third with 6.5%.
And note that this is only for desktop.
Mobile has its own section, where iOS VoiceOver claims the lion's share at 72%.
So now we'll cover some of the differences between popular screen readers.
To start cross screen reader testing you need to know how to properly operate each one.
They all work a little bit differently, but there's definitely some similarities.
The biggest split is whether it's a mobile or desktop screen reader.
Mobile primarily uses swipes, taps, single and multi-touch gestures, as well as clever use of the hardware buttons, like volume up and down, and touch to explore, where you can drag your finger around the screen to read what's underneath it.
iOS VoiceOver and Android TalkBack work basically the same way.
They just have different gestures to remember.
In contrast, laptop and desktop screen readers all make incredibly extensive use of keyboard shortcuts, but I think it's useful to split them into two categories.
Those with a browse mode by default, and those without.
Browse mode, also known as virtual mode, is conceptually very similar to how mobile screen readers work, but instead of intercepting all touch input, screen readers in browse mode intercept all keyboard input.
For example, the H key will jump heading by heading on a page.
Arrow down reads the next line of text instead of scrolling the page down a bit.
Space and enter are still used to invoke interactive controls, but that's done through the operating system accessibility APIs, not through key up and key down events sent to the browser.
Browse mode screen readers have a second mode called forms or focus mode, which does send keyboard input through to the browser.
They usually play a sound or "earcon" when switching between the modes.
And often automatically switch between browse and forms mode depending on whether or not focus is on a control that you can type into.
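If you want to feel that split for yourself, here's a tiny test page you could throw together; the markup and text are placeholders I made up just for illustration. With NVDA in browse mode, arrowing around moves the virtual cursor and the page's keydown listener stays silent; tab into the input and NVDA switches to focus mode, so the same listener starts logging.

    <p>Some readable text for the virtual cursor to walk through.</p>
    <input type="text" aria-label="Search">
    <script>
      // Fires only when the browser actually receives the key event,
      // for example in NVDA's focus mode, or with VoiceOver, which
      // passes most keys through to the browser by default.
      document.addEventListener("keydown", (event) => {
        console.log("Browser received keydown:", event.key);
      });
    </script>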
In contrast, Mac OS VoiceOver doesn't have a browse mode by default, and it requires you to hold down the VO keys, Control and Option by default, to send commands to the screen reader.
And fun fact, Narrator used to be over there with VoiceOver; it's only in the past few years that Narrator added a browse mode in response to user feedback.
So these things can change.
Armed with that mental model, I hope you have a much easier time getting started with each one.
webaim.org has a great collection of free articles that document the basic keyboard shortcuts and gestures; use those as a kind of cheat sheet.
WebAIM is usually one of the top hits if you search NVDA keyboard shortcuts or VoiceOver gestures.
Okay.
Enough background, let's dive into some examples.
Imagine you're sitting at your desk after seeing this talk and you're firing up some screen readers for the first time.
What are some of the first questions you're likely to have?
You'll immediately notice that each screen reader sounds a bit different.
Each one uses a different default TTS or text to speech engine and some sound more robotic than others.
Here's VoiceOver's default TTS, named Alex [Screen reader reads] heading level one.
Hello world.
And now here's JAWS's default TTS, Eloquence [Screen reader reads] Heading level one. Hello world.
Given the rise of very realistic voices like Siri and Alexa, you might wonder why it sounds so robotic and if that's any indication of the screen reader's quality.
The short answer is no. Screen reader users prefer voices that can be understood at very high speech rates.
Samuel Proulx, who's also speaking, often kicks off demos by blowing everyone's socks off with his default speech rate, which is all the way up.
Léonie Watson also has a super in-depth blog post that goes into the various TTS approaches out there, if you want to take a deep dive.
Every screen reader supports multiple voices.
So try different ones out.
Just know that realism isn't always the best quality to judge them by, which is a good segue into pronunciation.
Maybe one of the most common issues that comes up when cross screen reader testing is that VoiceOver pronounces a string one way, while NVDA says it some other way.
It's important to know that screen readers and TTS engines are actually separate technologies.
The screen reader is mostly responsible for generating strings of text that are fed to the user-selected TTS.
It's largely up to the TTS, whether to spell out a word in all caps, or to read a phone number as some sort of very big numeric value.
Try not to get hung up on the pronunciation differences.
There's just not that much we can do about it on the web right now.
There is a W3C working group to specifically address this issue that I hope will give us some more options in the future.
And I know, I know, it can be super painful to hear your company's name butchered.
For example, this is how VoiceOver pronounces Deque, which is a leading accessibility company [screen reader reads] heading level one, about D Q incorporated.
And now NVDA [screen reader reads] heading level one, about Dick systems incorporated.
That said, you might be tempted to try to use an aria-label attribute filled with a phonetic spelling to force the TTS to say the right thing, but please don't do this.
People too often forget that users with a refreshable braille display hooked up to the screen reader will find your phonetic spelling instead of the actual word.
And there's really no way for them to get at the correct spelling.
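To make that concrete, the anti-pattern looks something like this; the phonetic spelling is just my own made-up illustration, not a real recommendation.

    <!-- Please don't: braille users get the phonetic spelling, not the real name -->
    <h1 aria-label="About deck-you Incorporated">About Deque Incorporated</h1>

    <!-- Better: keep the real spelling and let each TTS do its best -->
    <h1>About Deque Incorporated</h1>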
NVDA has a virtual braille viewer that displays exactly what would be sent to an external braille display, which you can use to check for yourself.
Here's what that looks like.
[screen reader reads] Notification, overflow.
NVDA menu.
Tools menu, braille viewer B.
JS Bin Mozilla Firefox JS bin document, heading level one, D Q. [Weston] At best, the user learns the wrong spelling of your brand name.
At worst they're terribly confused.
Remember that a TTS isn't the only way that users experience a screen reader's output.
Maybe the next most common issue is discovering differences in rendering or how the screen reader decides to transform UI elements on your site into speech strings.
Here's a fairly famous example that involves styled unordered lists.
First, VoiceOver [VoiceOver reads] heading level one, departments, link kitchen, link bed and bath.
And now NVDA [NVDA reads] heading level one, departments, list with five items, link kitchen, link bed and bath. [Weston] Did you catch the difference?
They're the same, except that NVDA conveys that the kitchen link is in a list with five items, but VoiceOver completely omits that information.
To understand why VoiceOver and NVDA are different, we need to understand that the screen reader rendering UX is constantly evolving, and screen readers actually compete with one another based on that rendering UX.
This is a stark difference from web browsers.
The web platform is also constantly evolving, but there are extensive specifications from the W3C about how HTML and CSS should be visually rendered that each browser conforms to.
Yes, browsers disagree a bit about the defaults, but that's why we have CSS resets.
Forget all of that when it comes to screen reader rendering.
Each has wide leeway to decide what's best for its users.
That can include omitting or including extra information, the terminology they use (for example, VoiceOver calls images "images" while NVDA calls them "graphics"), and the order in which things are announced (for example, "kitchen link" versus "link kitchen").
With screen readers, there is no equivalent of a CSS reset.
I remember when I first learned about this, my knee jerk reaction was to reach for my pitchfork.
You know, there should be consistency, this isn't right.
I want to try to persuade you to set down your pitchforks and embrace the differences.
Screen readers are all about providing access, right?
Helping people with disabilities get things done. When something doesn't work well with a screen reader, whether it's the screen reader's fault or the browser's fault or the developer's fault, it doesn't matter to the end user: they're blocked.
And that sucks.
Competition is a powerful force.
It's certainly not universally good, but at its best it can drive some wonderful innovation.
And when screen readers innovate, especially on their UX, their users win.
To put that in a stronger example, JAWS is a paid screen reader that has a hard earned reputation of being worth every penny, by going above and beyond to deliver the best possible UX.
You might be able to catch a serious accessibility issue in this code snippet: the label isn't associated with the input.
It needs a 'for' attribute pointing to the input's ID.
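The snippet from the slide isn't reproduced in this transcript, but the issue and the fix look roughly like this; the field name and id are placeholders.

    <!-- Broken: no programmatic association between the label and the input -->
    <label>Email address</label>
    <input type="text">

    <!-- Fixed: the for attribute points at the input's id -->
    <label for="email">Email address</label>
    <input type="text" id="email">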
When tabbing to this input with most screen readers, the input will be announced without a label, which is a terrible UX. But not JAWS.
JAWS applies some clever heuristics to figure out that the label right next to the input is probably the input's label.
Could JAWS's heuristic make a mistake?
Absolutely.
And that's frustrating as a web developer, but they prioritize their users' needs here.
And given how prevalent simple mistakes like this are on the web today, I think they're making the right call.
I hope I've convinced you not to worry about the rendering details too much, but you still might be having a hard time with VoiceOver not announcing list with five items.
Maybe you have a situation where that information would be extremely helpful or even user feedback.
Under the hood, the example site was using semantic tags, ULs and LIs, and VoiceOver does convey the list semantics, list length, and position for vanilla bulleted lists.
So why didn't it?
It turns out VoiceOver has a heuristic that checks to see if CSS list-style: none was applied.
Apple isn't always the most public about their decisions, but in this case James Craig, Apple's accessibility standards manager, shared that their users were overwhelmed by an overuse of semantic lists on the web.
I can imagine how hearing "list with five items" over and over and over again could get on my nerves, but this is just Apple's take; other screen readers disagree.
I support Apple here.
My one critique is that they should be more public about documenting decisions like this, so web developers have a clear picture of what's expected and what's not.
By the way, it is possible to override VoiceOver's heuristic with some ARIA by applying role equals list to the UL.
But please carefully consider that decision.
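As a sketch, with the class name and URLs standing in for the earlier departments demo, that override looks something like this:

    <style>
      .departments { list-style: none; } /* this is what trips VoiceOver's heuristic */
    </style>

    <!-- role="list" asks VoiceOver to announce the list semantics anyway -->
    <ul class="departments" role="list">
      <li><a href="/kitchen">Kitchen</a></li>
      <li><a href="/bed-bath">Bed and bath</a></li>
    </ul>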
This is a great situation for usability testing.
The third, most common confusion I see when folks start cross screen reader testing is around how to properly operate them.
We know that some rendering differences between screen readers are to be expected.
But what about the bigger stuff?
Like whether or not you can actually interact with a control or how focus management works?
In cross-browser testing, if you have the same webpage and you perform the exact same set of actions on it in two different browsers, and in each browser you get a different result, you've either discovered a browser bug or a browser quirk.
And this is because browsers all share the same interaction model.
The mouse works in the same way: left click to select, right click to pull up a context menu. Your keyboard's arrow down key fires a keydown event in every browser.
Tab always moves focus between interactive controls.
Okay, okay, maybe I won't get too far without one exception: I'm looking at you, Safari.
And that's kind of an accessibility in-joke; for those that don't know, Tab doesn't work out of the box with Mac OS or Safari unless you change some default settings.
Why Safari chooses to be different than everything else out there is a mystery to many, but anyway, just remember that.
Unlike browsers, screen readers have different interaction models.
You can't always perform the exact same keyboard shortcuts in two screen readers and expect similar results.
But it is super easy to forget that. Here's a quick example.
GitHub has a dropdown menu when you click your profile icon, and it's keyboard accessible.
You can tab to it, use space or enter to open it, and then use the up and down arrow keys to move focus between options.
Focus is properly managed.
It restarts at the top once you reach the bottom and escape closes it.
This works the same way in every single browser.
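That wrapping behavior usually comes from a small keydown handler along these lines; this isn't GitHub's actual code, just a minimal sketch of the pattern, and notice it only runs if the arrow key events actually reach the page.

    <script>
      // Assume the open menu's items all carry role="menuitem".
      const items = Array.from(document.querySelectorAll('[role="menuitem"]'));

      document.addEventListener("keydown", (event) => {
        const index = items.indexOf(document.activeElement);
        if (index === -1) return; // focus isn't inside the menu

        if (event.key === "ArrowDown") {
          event.preventDefault();
          items[(index + 1) % items.length].focus(); // wrap bottom to top
        } else if (event.key === "ArrowUp") {
          event.preventDefault();
          items[(index - 1 + items.length) % items.length].focus(); // wrap top to bottom
        }
      });
    </script>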
Now let's see how it works in VoiceOver.
[VoiceOver reads]View profile and more.
Menu pop-up button.
Menu, 15 items. Signed in as Weston Thayer.
Interactive set status Interactive.
Your profile.
Your repositories.
Your codespaces, your organizations, your projects, your stars, your gists, upgrade, feature preview, help, settings, sign out, signed in as Weston, set status, your profile...
[Weston] So you can tab to it, and it's announced as a button that can be expanded.
You can open it with space or enter.
And as you use the arrow keys, each item is announced.
Focus management also works.
It wraps around.
Awesome.
Let's try it again with NVDA.
[NVDA reads] Clickable view profile and more button collapsed.
Sub menu expanded, menu, menu items, signed in as Weston Thayer.
Clickable menu item, menu, item your profile.
[Weston] So you can tab to it, it's announced as a button that can be expanded, and you can use the arrow keys to move between each item.
Let's keep going.
[NVDA reads] Menu item, menu, menu, menu, menu, menu, menu, menu, menu item, menu item, menu item, Sign out, out of menu, main landmark, heading level one, GitHub. [Weston] Catch that?
We pressed down arrow on the last item in the menu, and focus actually moved outside of the dropdown and onto the page behind it.
That means our focus management code wasn't working right.
Is this just a quirk of NVDA or maybe it's VoiceOver that has the quirk?
No, there's no quirks.
If we think back to the different categories of screen readers, remember that NVDA uses a browse mode/focus mode paradigm while VoiceOver doesn't.
If we were to set a breakpoint on the dropdown's focus management code, we wouldn't hit it with NVDA, and that's because NVDA is in browse mode.
It's intercepting the arrow keydown and keyup events and interpreting them as commands to move its virtual cursor on the page.
VoiceOver doesn't have a browse mode.
The key events are going to the browser and your breakpoint would be hit.
We aren't doing an apples to apples cross screen reader test here.
With VoiceOver, we're sending native keyboard events to the browser and VoiceOver is following along with the browser focus.
With NVDA, we're in browse mode and the arrow keys move NVDA's virtual cursor.
It would be more apples to apples if we either forced NVDA into its forms mode so that native keyboard events were fired, or if we held down the VoiceOver keys, Control plus Option, with VoiceOver to move its own virtual cursor.
But let's zoom out here because I think this is a super common pitfall.
Our goal with cross screen reader testing is to try to find barriers that real screen reader users would hit.
So we should be designing our test cases based on how a real screen reader user would test.
It's too easy to skip over this step and assume that we're using the correct commands.
Remember screen readers are complex and they also have a complex learning curve.
Your early experiences as a sighted developer are usually not representative of the experiences of more experienced low-vision and blind users.
When designing a dropdown menu like this, we need an understanding of how an NVDA user would expect to interact with it, and how a VoiceOver user would expect to interact with it.
That takes some research and experience and often exposes issues with our component's design.
One resource that can help is the ARIA authoring practices.
It's a set of example patterns built with ARIA from the W3C WAI-ARIA working group.
They aren't guaranteed to work in all screen readers, but a lot of thought has gone into them.
If we recall how NVDA announced items in the dropdown, it said "menu" and "menu item". Those aren't ARIA labels.
That's how NVDA was announcing their ARIA roles.
You might've read elsewhere that an ARIA role was like a contract.
If you put role equals button on something, that's telling the screen reader that this element walks, talks, and acts like a button.
It's your job to uphold your side of the contract and make sure that you can click that button, tap it, tab to it, use space and enter with it.
The ARIA authoring practices have an example for the menu and menuitem roles used with a button.
It's called the menu button example, and it specs out the expected keyboard interactions associated with it.
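Stripped way down, the pattern's markup looks something like this; the ids and item labels are placeholders, and the real example also includes the scripting that manages focus and aria-expanded.

    <button aria-haspopup="true" aria-expanded="false" aria-controls="profile-menu">
      View profile and more
    </button>

    <ul id="profile-menu" role="menu" hidden>
      <li role="menuitem" tabindex="-1">Your profile</li>
      <li role="menuitem" tabindex="-1">Your repositories</li>
      <li role="menuitem" tabindex="-1">Sign out</li>
    </ul>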
In most cases, the authoring practices didn't make up their own keyboard interactions.
Rather, they carefully based them on OS native controls that have the same role.
If we tried out a native application's menu with NVDA, we'd find that focus wrapping when pressing the arrow keys is a core expected behavior of the menu and menuitem roles.
Another easy way to do this is to try the ARIA authoring practices example with NVDA.
[NVDA reads] Actions, menu, button, sub menu, expanded, actions menu.
Action one, one of four.
Action, two, two of four.
Action three, three of four.
Action four, four of four.
Action one, one of four.
[Weston] Notice how focus wrapping works.
There's also a typewriter sound, which is NVDA's earcon for switching into focus mode.
And that allows the dropdown's core focus management code to run.
This means that there isn't an NVDA quirk or a VoiceOver quirk; there's a bug in GitHub's dropdown.
By backing up and getting a clear understanding of how a screen reader user would expect to interact with the control, cross screen reader testing becomes easier and less confusing.
Familiarizing yourself with the ARIA authoring practices is one way to do that, but there are many others like running usability tests or interviews with real screen reader users.
There are many great blog posts out there from accessibility professionals that debate the best pattern for the job.
Here's a highly relevant one from Adrian Roselli that goes into the UX trade-offs for navigation drop-downs on websites.
GitHub's dropdown is full of links that navigate elsewhere on the site.
Is acting like a native app menu, which usually contains commands, the best choice?
Okay.
I think that covers the most common questions you'll have when you get started.
Now, I'm going to take a whirlwind tour through some of the smaller quirks that can trip you up.
First up: focus syncing.
Every screen reader lets you navigate to non-interactive elements like headings and paragraphs by way of a virtual cursor.
The virtual cursor focuses different elements on your page, but this is different from browser focus, or document.activeElement.
Some screen readers sync the browser focus with the virtual cursor when they can, others don't.
NVDA used to keep focus synced, but recently turned this off because of performance issues that arise waiting for the browser to finish moving its focus.
iOS VoiceOver has flip-flopped several times before settling on not syncing focus in iOS 15.
This is most often noticed with skip links that are visually hidden until focused.
As long as they still work while visually hidden you'll be fine, but if you're enabling some functionality on the focus event and disabling it again on blur, it will break with some screen readers.
This screenshot shows iOS VoiceOver's virtual cursor on the skip link.
It's okay that the skip link is still visually hidden, just as long as you can still invoke it with VoiceOver's double-tap gesture.
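For reference, a typical skip link is wired up something like this; the class name and target id are my own placeholders, and this is just a sketch of the idea.

    <style>
      .skip-link {
        position: absolute;
        left: -9999px; /* visually hidden until it receives browser focus */
      }
      .skip-link:focus {
        left: 0; /* revealed for sighted keyboard users */
      }
    </style>

    <!-- With iOS VoiceOver, browser focus may never land here, so the :focus
         style never applies, but the link itself still works when invoked. -->
    <a class="skip-link" href="#main">Skip to main content</a>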
Next up is user configured screen reader settings.
This is sort of a self-induced quirk.
Screen readers are incredibly configurable.
They have a setting for pretty much everything.
It's all too easy to change a setting for some testing, and then completely forget that you changed it when you come back to test weeks later.
This is doubly true when sharing test devices.
My advice: always reset to the factory defaults.
And finally, some browsers like Chrome and Firefox do not construct their internal accessibility trees until a screen reader or some other assistive technology connects to them via the operating system accessibility APIs.
This is for performance; the browser accessibility tree needs to be maintained alongside the DOM, so it's not free.
Historically, there have been bugs where the browser doesn't correctly construct its accessibility tree after being open for a while.
So it's always a good idea to fire up the screen reader before the browser, which most real-world users would do anyway.
As far as I know, this isn't an issue for Safari, so it may catch those of you who've been exclusively testing with VoiceOver on a Mac until now off guard.
Okay.
A little bonus section to wrap things up.
I wanted to share some tools and resources you can use while cross screen reader testing.
First up, one approach I often use when deciding when to cross screen reader test is to think about the support for the HTML, CSS, and JavaScript features that I'm using.
In terms of cross-browser testing, if I'm building something that uses the new Clipboard API, I'll usually consult caniuse.com to get an idea of which browsers it might not work so well in.
And then I'll conduct more rigorous manual testing in those browsers.
Did you know there's a similar website for screen readers?
It's called a11ysupport.io, and a quick shout-out: Michael Fairchild, who started it, is looking for help maintaining it.
Similar to caniuse, it can be a great resource to figure out if there might be some spotty screen reader support for some of the HTML or ARIA you're using.
You might even be surprised to find that web platform features that have been around for a long time, like datalists, don't work well everywhere.
Second, know where to get help.
There's a Slack called web-a11y, for web accessibility, which is kindly sponsored by Slack and filled with thousands of professionals.
There's even a bug-fix channel specifically for accessibility bugs in browsers, screen readers, and other assistive tech.
Unfortunately due to an influx of spammers, you need to request access, but if you ping me on Twitter, I'd be happy to send you an invite.
And third you'll need access to each screen reader you want to test in.
You have several options: you can use or install the screen reader on the OS you already have, if that's supported, or you could buy or borrow a computer or a phone that has the OS you need.
Or if you're on a Mac and you have enough disk space and Docker hasn't eaten all of your RAM, you could create a Windows virtual machine.
It can be a bit tricky to get keyboard shortcuts working, but there's some great guides out there.
Android TalkBack can also be tested from the Android emulator, which installs with the Android Studio dev environment.
Unfortunately, creating a Mac OS VM on Windows is illegal and the iOS simulator does not currently include VoiceOver.
And finally cloud testing services let you test instantly from anywhere.
BrowserStack recently added support for turning on VoiceOver on Mac OS and Narrator on Windows.
And this is a bit of a shameless plug, but the tool I'm working on right now, Assistiv Labs, is currently the only cloud service that also supports NVDA and JAWS.
Mac OS VoiceOver and Android TalkBack are in early access.
If you're interested in those, please reach out.
I'd love to talk.
And we're done.
Thank you so much for watching.
I hope this has been useful and inspired you to try testing in different screen readers.
I'd love to answer your questions in the chat or on Twitter at @WestonThayer5. You can also follow along with what I'm up to at assistivlabs.com.
That's assistive without the E labs.com.
Thank you so much, and see you online.