Stammering Accessibility and Testing for Voice Assistants & Devices

“Alexa, is it going to rain today?”

How long does it take for a non-stammering person to say that?

For a person who stammers (PWS):

  • Light Stammering : 5 – 15 seconds
  • Medium Stammering : 15 – 25 seconds
  • Heavy Stammering : 25 – 60 seconds

Here is what might surprise you

A person who stammers oscillates between

  • Light Stammering to Medium Stammering
  • Heavy Stammering to Medium Stammering. 

It changes depending on

  • The environment
  • The familiarity of the people
  • The context in which they are speaking
  • The level of consciousness about their own stammering
  • The way they feel intimidated
  • Speaking over phone
  • Speaking on stage 
  • Speaking with strangers
  • A job interview 
  • A conversation with the opposite sex
  • Emotions and anxiety. 

How do I know?

Because I am one of them. I have been stammering right from the time I started to speak at the age of 3. I have shared my detailed story over here. In a nutshell, the story I shared is about how I helped myself move from being a Heavy Stammerer to a Light Stammerer.

After I published my story, I got in touch with Dhruv, who then introduced me to TISA (The Indian Stammering Association) and SHG (Self Help Groups). I have spoken at Meetups and Conferences of TISA in Mumbai and Bangalore, which gave me the wonderful opportunity to interact with hundreds of People Who Stammer (PWS) and Speech Language Therapists (SLP). Even if you don’t stammer, you should attend this conference at least once in your life. It is the most inspiring and humbling experience I have had, especially meeting the kids who come there.

The many types and styles of stammering

Stammering / Stuttering / Speech Disorders can be caused by

  • Heredity
  • The way the brain developed in childhood
  • An emotional and traumatic event such as abuse
  • Physical injury to the head from an accident
  • Stroke

So a person who doesn’t stammer today may stammer in the future, and a person who stammers heavily today may move to light stammering. People can move from Light Stammering to Heavy Stammering, or vice versa, over the course of their life.

The different styles of stammering

Type

  • Repetition of letters
  • Repeated throw of air while uttering a word
  • Repetition of words
  • Long pause in between words
  • Shaking jaw without uttering a word

Words that start with (most commonly)

  • Sa
  • Ha
  • Aa
  • Va
  • Pa
  • Ra
  • Ta
  • Cha
  • Dha
  • Ho

Different places in the sentence

  • At the start 
  • In between the sentence
  • Intermittent
  • At the end
  • For every word

Speed of the sentence varying from

  • Stammer at the start – fast completing of sentence
  • Normal start of the sentence and a sudden stammer in between
  • Stammer at the end of the sentence

Amplitude of the speech

  • Varies
  • Can become suddenly low for some words
  • Can suddenly be high for some words

Hence, standard voice and speech recognition may fail to detect what PWS say.

Elon Musk Stutters

About 1% of the world population stutters. So does Elon Musk. If Tesla’s voice commands don’t work for Elon Musk, imagine how it would feel for him to have built a car that embarrasses him whenever he uses voice commands.

Not just Elon Musk. There are many geniuses in the world who lived with a stutter or stammer. Aristotle was a stammerer. Alan Turing was a stammerer. Elvis Presley, to add to the crown. When I go through the list, it makes me feel special. Here are some of my favorite people who stammer:

You might have heard their speech and wondered if they really stammer? That is because as I mentioned earlier, the stammering varies. For instance, Hrithik Roshan’s stammering varies quite a lot. He moves from no stammering to medium stammering in between a sentence and returns to light stammering. 

Why am I happy that celebrities stammer? 

First, it gives hope to many people who think stammering is a blocker in life. I have seen it as an unfair advantage I have, and so have many others. Second, from the point of view of accessibility, I am sure people building global products look up to top celebrities like Elon Musk, both for inspiration and as users of their products.

Voice enabled world is intimidating to stammerers 

When I looked at Amazon Echo and Google Home and saw that they had no interface for me to type, I was like, “Wow, this thing is going to be tough on people who stammer”. A huge economy is being forecast based on a voice enabled world. Voice commands are everywhere today, from our cars to fridges to maps to apps.

Many analysts predict that the voice enabled assistant market will be 8 Billion USD by 2023. I use a voice assistant in my car. It has really helped quite a lot. I use it on the TV remote. We saw Sundar Pichai demo the Google Assistant answering calls and making appointments. A good friend and fellow start-up entrepreneur, Kumar, founded Slang Labs, which specializes in adding voice to apps, and they put up a Voice Enabled Search for Covid-19 cases in India.

So, the voice is becoming real. Voice is going to take over everything we do. People are building products with just voice as an interface. Beyond Amazon and Google, many devices are going to have no GUI, just a voice assistant.

This is the world People Who Stammer will hate to live in unless they are included in how these products are built.

Here is an example:

I intentionally stammered to ask “What is the time?” on Google Assistant

“The” needs air to be thrown out to make the sound “Dha”. I have had many issues saying “The” when it comes in between a sentence. So, here is what happened with Google Assistant.

Google cut me off after I had stammered for a while and shouted (that is how it sounds to a stammerer), “Sorry, what are you asking?” This feels rude to my face. It can make me stammer more because I feel shamed. I feel rejection. The more ashamed I feel, the more I stammer. The accessibility enabled talk back could be: “If you could repeat, I will try to help you better”. It is less intimidating, has a service mindset, and is polite.

Sometimes, I stammer to say “OK Google” or “Siri” or “Alexa” itself. 

Accessibility is new to this new technology of voice based assistants. I searched the internet for material and there isn’t enough. I found this blog post by a fellow entrepreneur, Andy Theyers from the UK, and thought I should share it with you. He made the beautiful suggestion that people who stammer should be able to change “Alexa” to some other name that is convenient for them. As I mentioned, it can be tough sometimes to say some words.

During my childhood, “Sa” was very tough, especially when it was at the start of a sentence or a word. Saying “Siri” might be very difficult for many people. In the video below, you will hear from my friends. I mean, I don’t know them, but if we stammer together, we will instantly develop a friendship.

What does accessibility for voice enabled devices mean?

It means the ability to accommodate people with speech impediments and disabilities, such as stuttering or stammering, so that they can interact with voice enabled devices.

While building this article piece by piece, I had the opportunity to connect with PWS who have used voice assistants and ask them what kind of challenges they find in using current day voice systems. They seem to have the following issues with the current implementation of voice assistants and devices:

  • Premature listening
  • Short listening times
  • Consistently incorrect results for intended words

If there were accessibility guidelines and standards for voice based systems, they might include:

  • Configurable Invoking name : From Siri, Alexa, Google to a name PWS is comfortable with
  • Listening time : Customize the time to listen
  • Offer suggestions : Provide choices based on the interpretation
  • Gesture control : Add a gesture control to make life even easier
  • Polite talk back : A safe, non embarrassing, friendly talk back
  • Alternate Mode : A whistle or sing mode to help make faster choices
  • Talk back speed : Alter talk back speed to feel comfortable

The experience of a stammerer with accessibility enabled voice assistants might look like this

PWS = Person Who Stammers, VA = Voice Assistant

The happy path scenario

  • PWS : “Hhhhhhhhhh   hhhhhhhhhheyyyy Lee”
  • VA : “Hey, Pradeep, I hope you are doing good today?”
  • PWS : “Sssss SSSSSSSS sssssssss   sssssss SSSSchhedule a meeting with Aaa AAamit ttt, tuttuu……. tu tu….tummaro ha ha ha hat for”
  • VA : “Sure, Pradeep, I am happy to help you.  Interesting you want to meet Amit. I believe this is the same Amit Prakash you met last month. If I heard it right, please whistle or clap to confirm”
  • PWS : <Claps>
  • VA : “Thanks for clapping. I have added a calendar event for a meeting with Amit Prakash tomorrow at 4 PM.”
  • PWS : <Claps>
  • VA : “Yay, you clapped. I love helping you, Pradeep. Please come back.”

The use of gestures makes it easy for PWS to acknowledge, instead of saying “Yes” or “Confirm”

The most likely scenario

  • PWS : “Hhhhhhhhhh   hhhhhhhhhheyyyy Lee”
  • VA : “Hey, Pradeep, I am happy you woke me up. What can I do for you today?”
  • PWS : “Sssss SSSSSSSS sssssssss   sssssss SSSSchhedule a meeting with Aaa AAamit ttt, tuttuu……. tu tu….pause for a few seconds…tummaro ha ha ha hat for thaty”
  • VA : “Sure, Pradeep, I am happy to help you.  Interesting you want to meet Amit. I believe this is the same Amit Prakash you met last month.”
  • VA: “If I heard it right, please whistle or clap to confirm”
  • PWS : <No clap>
  • VA : “I didn’t hear your clap. I am sorry, I got it wrong. Here are three other options of Amit” – “Amit Vyas, Amit Somani, Amit Shankar”
  • PWS : <No clap>
  • VA : “Looks like I got the name wrong. I am sorry, I am learning to help you. Please say just the name if you can and this time I will do better”
  • PWS : “Huhhhhhhhhhhmeeet Pa pa pa pa pradhaan”
  • VA : “Ameet Pradhan. Thanks for helping me, Pradeep. Please clap to confirm”
  • PWS : <Claps>
  • VA :  “Love you, Pradeep”

Far less intimidating. Less talk. More gestures. Polite talk back. Less guilt for PWS, plus the VA seems to have life and is friendly enough to give comfort to PWS. A happier experience than the way Voice Assistants are designed today.
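
The confirm-by-gesture flow in the two scenarios above could be sketched roughly as below. This is only an illustration under assumptions: `heard_clap` and `say` are hypothetical callables standing in for a real clap detector and a real text to speech engine, and offering the alternatives one at a time (rather than all three together) is my own simplification so that a single clap is unambiguous.

```python
def confirm_with_gesture(best_guess, alternatives, heard_clap, say):
    """Offer the best guess, then each alternative, and wait for a clap.

    heard_clap: hypothetical callable returning True if a clap or whistle
                was detected after the latest prompt.
    say:        hypothetical callable that speaks a polite prompt.
    Returns the confirmed name, or None if nothing was confirmed.
    """
    say(f"I believe you mean {best_guess}. "
        "If I heard it right, please whistle or clap to confirm.")
    if heard_clap():
        return best_guess

    for name in alternatives:
        say(f"I am sorry, I got it wrong. Did you mean {name}? "
            "Please clap to confirm.")
        if heard_clap():
            return name

    # Nothing confirmed: ask for the shortest possible spoken input.
    say("I am sorry, I am learning to help you. "
        "Please say just the name if you can.")
    return None
```

The PWS never has to say “Yes” or “Confirm”, and every re-prompt takes the blame on the assistant’s side.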

Testing for Stuttering and Stammering Accessibility (when implemented)

Testing with Tools

Just as accessibility scanners exist today to test for Alt Text, a scanner could be built to assess the code level implementation of accessibility for voice assistants and devices.

Testing with Voice Recording

Static testing can be done with recorded voice samples, played through a speaker directed at a voice assistant or a device.

While that is possible, we need a good collection of voice samples. Here is a repository of voice samples from Anjan as a start for Google, Alexa and Siri. I am interacting with a few SLPs to find more PWS volunteers and collect more voice samples. Anjan was happy to contribute and have his name mentioned here, although he had the option of not being named.

This offers a quick, though not fully accurate, validation of the accessibility implementation. At the least, it can qualify a build to go to humans for further testing.
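
A static test run over recorded samples could be sketched like this. It is a rough illustration, not a real harness: `recognize` is a hypothetical callable that takes a recording and returns whatever the assistant understood (in practice it would wrap the speaker-and-device setup), and the sample tuples and severity labels are assumptions of mine.

```python
import string
from collections import defaultdict


def normalize(text):
    """Lowercase, strip ASCII punctuation and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def run_static_suite(samples, recognize):
    """Return the pass rate per stammering severity.

    samples:   iterable of (recording, expected_text, severity) tuples.
    recognize: hypothetical callable mapping a recording to the text
               the voice assistant heard.
    """
    counts = defaultdict(lambda: [0, 0])  # severity -> [passed, total]
    for recording, expected, severity in samples:
        heard = recognize(recording) or ""
        counts[severity][1] += 1
        if normalize(heard) == normalize(expected):
            counts[severity][0] += 1
    return {sev: passed / total for sev, (passed, total) in counts.items()}
```

A low pass rate for one severity level is a signal about where the build fails, before any human tester is asked to spend time on it.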

Beta Testing with the help of humans

Since stammering is dynamic, humans need to be in their natural settings to test the voice assistants. Companies like Apple, Google and Amazon would have some PWS among their employees. However, a larger sample can be gathered by collaborating with the Stammering Institute / Association of each country.

Users may not be good at reporting bugs. The good news is that I have come across quite a number of good software testers who stammer and can participate in such studies.

Unfortunately, bringing in people who stammer and getting them to intentionally test the device might not help, because I know a PWS who doesn’t stammer much in front of the device but stammers heavily when face to face with humans. So, picking a sample set of people who are good representatives of different stammering styles, and letting them beta test in real conditions, matters a lot to getting the accessibility right.

Barrier Break, from Mumbai, India, founded by Shilpi Kapoor, recruits people with different disabilities (or should I say, smarter people than normal humans). I have had the pleasure of visiting them, being blown away by the people and understanding their accessibility testing practices. They are a wonderful team to work with.

Arriving at the right sampling

A study can only be successful if it is done with the right set of samples. Since stammering has a vast variety, the people in the sample set should span the categories listed in the section The many types and styles of stammering above.
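
One simple way to get that spread is stratified sampling: group the volunteers by category and pick from every group. The sketch below assumes each volunteer is a dictionary with hypothetical `severity` and `style` keys; a real study would likely use more of the categories listed earlier.

```python
import random
from collections import defaultdict


def stratified_sample(volunteers, per_group, seed=0):
    """Pick up to per_group volunteers from every (severity, style) group.

    volunteers: list of dicts with hypothetical "severity" and "style" keys.
    seed:       fixed seed so the selection can be reproduced.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for volunteer in volunteers:
        groups[(volunteer["severity"], volunteer["style"])].append(volunteer)

    picked = []
    for key in sorted(groups):
        members = groups[key]
        picked.extend(rng.sample(members, min(per_group, len(members))))
    return picked
```

Sampling per group, instead of from the whole pool, keeps a rare style from being left out just because few volunteers have it.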

Scientific and Subjective Testing

In the Multimedia space there are concepts of Scientific and Subjective Testing. Both apply, metaphorically and literally, to accessibility testing for stammering.

Scientific testing in Multimedia involves the analysis of Signal To Noise Ratio, Attenuation, Packet Drops and Jitter. In this context, it is possible to build a tool that assesses the accuracy of listening and the number of iterations a PWS goes through to achieve a desired result.
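
As a sketch of what such a tool might measure, the code below computes word error rate (a standard speech recognition metric, which I am swapping in for “accuracy of listening”) and the number of attempts needed before the assistant got it right. The function names are my own, and how the transcripts are obtained is left open.

```python
def word_error_rate(intended, heard):
    """Word-level edit distance between intended and heard, per intended word."""
    ref = intended.lower().split()
    hyp = heard.lower().split()
    if not ref:
        raise ValueError("intended text must not be empty")

    # Classic dynamic programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word dropped
                            curr[j - 1] + 1,      # extra word heard
                            prev[j - 1] + cost))  # word substituted
        prev = curr
    return prev[-1] / len(ref)


def attempts_until_success(outcomes):
    """Return the 1-based attempt that first succeeded, or None if none did."""
    for attempt, succeeded in enumerate(outcomes, start=1):
        if succeeded:
            return attempt
    return None
```

For example, if the intended command is “what is the time” and the assistant hears “what is time”, one of four words is lost and the error rate is 0.25.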

Subjective testing in this context would give the PWS a scoring sheet to rate how the system behaved against the criteria of the accessibility implementation.

A PWS who has beta tested the system could answer questions like: On a scale of 1 to 5, 1 being least helpful and 5 being very helpful, how polite was the talk back?

Data collected across the sample of PWS can then be collated to arrive at the usefulness of the implementation.
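
Collating the scoring sheets can be as simple as averaging each criterion across testers. A minimal sketch, assuming each sheet is a dictionary of criterion name to a 1 to 5 score (the criterion names used in any real study are up to the people running it):

```python
from statistics import mean


def summarize_ratings(sheets):
    """Average each criterion's 1 to 5 score across all scoring sheets."""
    by_criterion = {}
    for sheet in sheets:
        for criterion, score in sheet.items():
            if not 1 <= score <= 5:
                raise ValueError(f"{criterion}: score {score} is not in 1..5")
            by_criterion.setdefault(criterion, []).append(score)
    return {c: round(mean(scores), 2) for c, scores in by_criterion.items()}
```

An average alone can hide a tester who had a very bad experience, so it is worth reading the individual sheets as well.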

Depth of Testing

Shallow depth

This can be achieved through static testing with a speaker setup. It can comprise testing of basic functions such as:

  • Calling a contact
  • Setting up a reminder
  • Setting an alarm
  • Adding an event on calendar
  • Simple straightforward search

Medium Depth

This can also be partially achieved through the speaker setup. However, it requires a varied sample set and a bit of intelligent automation to vary the secondary responses, or the additional words that a PWS has to speak, for operations such as:

  • Sending a message
  • Searching with multiple criteria
  • Searching for an address in maps

Deep

This can only be achieved through humans. It covers complex operations such as:

  • Booking an appointment
  • Drafting an email
  • Any operation whose context is dynamic

Innovation Possibility

Great things have been accomplished with software. Software that speaks has been around for decades. We could think of software that is configurable to produce the kind of stammer we intend for a test.
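
As a first step towards that idea, here is a small sketch that produces a stammered version of a sentence as text. It only covers three of the styles described earlier (repetition of a starting sound, repetition of words, and a long pause), the style names and parameters are my own, and the output would still have to be passed to a speech synthesizer to become audio for a test.

```python
STYLES = ("sound", "word", "pause")


def stammer_text(sentence, style="sound", repeats=3, positions=(0,)):
    """Return the sentence with a stammer added at the given word positions.

    style:     "sound" repeats the first letter, "word" repeats the whole
               word, "pause" inserts a "..." marker before the word.
    repeats:   how many extra repetitions to add (ignored for "pause").
    positions: 0-based indexes of the words that should be stammered.
    """
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")

    out = []
    for index, word in enumerate(sentence.split()):
        if index not in positions:
            out.append(word)
        elif style == "sound":
            out.append("-".join([word[0]] * repeats + [word]))
        elif style == "word":
            out.extend([word] * (repeats + 1))
        else:  # "pause"
            out.extend(["...", word])
    return " ".join(out)
```

Calling `stammer_text("Schedule a meeting", "sound", 3, (0,))` gives “S-S-S-Schedule a meeting”, and changing `positions` moves the stammer to the middle or the end of the sentence.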

Semi Final Thoughts

It is interesting to see how the world of voice assistants unfolds. Given that the world is changing, we could see more devices with voice and less touch. Imagine walking into a lift and there are no buttons anymore. The move to a non-touch world is going to be accelerated by the Novel Coronavirus. Voice is the alternative everyone will be choosing.

As more people choose voice as an alternative to touch, PWS will face trouble, and hence the need for stammering accessibility is arriving much faster than before. All of us might be testing or working on apps that will have voice. These thoughts might come in handy and voicey too.

If you are a software engineer who stammers, please note that there are plenty like you who are doing pretty well. If you or a person you know wants to contribute to the repository of voice samples, please send me your samples after going through what Anjan has shared. It is easy to find my email id, or you can send a drive link to my LinkedIn.

May we build, test and live in a world that includes everyone and their needs.
