Improving the quality of spoken text on the web

The W3C Accessible Platform Architectures (APA) Working Group's Spoken Presentation Task Force is developing a standard mechanism that lets authors control how web content is presented by the text-to-speech (TTS) synthesizers used in assistive technologies such as screen readers. Voice assistants, such as Amazon Alexa and Google Home, would also benefit from this work.

One simple example can be seen with the following sentence:

According to the 2010 US Census, the population of 90274 increased to 25209 from 24976 over the past 10 years.

Without presentation control, the ZIP code 90274 is typically read as a number: "ninety thousand two hundred and seventy-four."

With Speech Synthesis Markup Language (SSML) markup, the author could specify that the ZIP code be read as individual digits.

According to the 2010 US Census, the population of <say-as interpret-as="digits">90274</say-as> increased to 25209 from 24976 over the past 10 years.
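To make the difference concrete, here is a minimal Python sketch of what "read as digits" means for the spoken output. It is only an illustration: the function name `speak_as_digits` is hypothetical, and a real TTS engine decides for itself how to honor `interpret-as="digits"`.

```python
# Hypothetical helper: spell out a digit string one digit at a time,
# roughly what a synthesizer might say for interpret-as="digits".
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def speak_as_digits(text: str) -> str:
    """Return one spoken word per ASCII digit in `text`."""
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError(f"expected a non-empty string of digits, got {text!r}")
    return " ".join(DIGIT_WORDS[int(ch)] for ch in text)


print(speak_as_digits("90274"))  # prints "nine zero two seven four"
```

Compare "nine zero two seven four" with the cardinal reading above: the text is the same, and only the presentation hint changes what the listener hears.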

I will repeat the sentence once more below, this time using inline SSML. This version has the actual SSML markup shown above embedded directly in the HTML. Consider it an experiment as we explore ways in which the SSML can be consumed.

According to the 2010 US Census, the population of 90274 increased to 25209 from 24976 over the past 10 years.

Now once more, with the SSML say-as embedded in the HTML using a data attribute, which looks like this: <span data-ssml='{"say-as" : {"interpret-as":"digits"}}'>90274</span>

According to the 2010 US Census, the population of 90274 increased to 25209 from 24976 over the past 10 years.
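As a rough sketch of how a consumer might use the data attribute, the Python below reads `data-ssml` from `span` elements and rewrites them as SSML elements. The class and function names are hypothetical, and the sketch assumes each attribute holds a JSON object with exactly one SSML element name mapped to its attributes, as in the example above. It produces an SSML fragment, not a complete `<speak>` document, and it is not a description of how any particular screen reader works.

```python
import json
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class DataSsmlConverter(HTMLParser):
    """Hypothetical converter: turns <span data-ssml='...'> into SSML elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # pieces of the output fragment
        self.open_spans = []  # SSML element name (or None) for each open span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return  # other HTML tags are dropped, their text is kept
        raw = dict(attrs).get("data-ssml")
        if raw is None:
            self.open_spans.append(None)
            return
        # Assumption: one SSML element per attribute, e.g.
        # {"say-as": {"interpret-as": "digits"}}
        name, ssml_attrs = next(iter(json.loads(raw).items()))
        attr_text = "".join(
            f" {key}={quoteattr(str(value))}" for key, value in ssml_attrs.items()
        )
        self.parts.append(f"<{name}{attr_text}>")
        self.open_spans.append(name)

    def handle_endtag(self, tag):
        if tag != "span" or not self.open_spans:
            return
        name = self.open_spans.pop()
        if name is not None:
            self.parts.append(f"</{name}>")

    def handle_data(self, data):
        self.parts.append(escape(data))


def html_fragment_to_ssml(fragment: str) -> str:
    """Return the fragment with data-ssml spans rewritten as SSML elements."""
    converter = DataSsmlConverter()
    converter.feed(fragment)
    converter.close()  # flush any buffered trailing text
    return "".join(converter.parts)


sample = (
    "the population of "
    "<span data-ssml='{\"say-as\" : {\"interpret-as\":\"digits\"}}'>90274</span>"
    " increased"
)
print(html_fragment_to_ssml(sample))
# prints: the population of <say-as interpret-as="digits">90274</say-as> increased
```

One appeal of the data attribute approach is that the page stays valid HTML, while a tool that understands the attribute can still recover the same say-as markup shown earlier.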
