In the last 25 years, during which I have worked for the PATINS Project, assistive technology has grown by leaps and bounds. Today I am specifically considering one technology and how it has advanced greatly.
It is interesting and somewhat exciting to see where it was and where it is going. My early involvement with text-to-speech (TTS) was with the software program Kurzweil 1000.
The software, when installed on an appropriately configured computer, would take scanned text and, through the program's optical character recognition (OCR), convert the recognized text to speech.
Kurzweil 1000 was primarily used by individuals who were blind or had low vision, although others began using it for students who had a reading disability. From that realization came the Kurzweil 3000 program, which addressed not just reading but also writing and study skills.
Many other text-to-speech programs have been developed, including Natural Reader, W.Y.N.N., Word Q, TextHelp Read and Write, Microsoft Narrator, and Snap and Read, to name a few.
These programs have had a major impact on struggling readers and those individuals who can’t access text in the traditional way.
For many users of TTS, one complaint common to all of these programs was the robotic voices, which were synthesized and lacked the inflection and other natural nuances of human speech.
Not only was TTS used in software programs, but it was and still is a vital component in Augmentative and Alternative Communication (AAC) devices, software and applications.
Although TTS was/is an integral part of assistive technology for individuals to communicate and interact, it was just a matter of time before it became mainstream.
Very few people using Siri or Alexa would know the background of TTS or its evolution from augmentative speech. These assistants have become fixtures in everyday life for young and old alike. Their voices sound realistic, and the artificial intelligence (AI) behind them makes them almost lifelike.
What got me thinking about my early experience with TTS is a program my wife, Rita, came across a few weeks ago. The program is Speechify, and it is a TTS program and much more. Speechify is a text-to-speech program for desktop or mobile devices that uses computer-generated voices.
It is one of many programs that incorporate OCR and convert the recognized text to speech. What is interesting about Speechify is that it doesn’t use the voice files that are part of the device’s operating system but generates speech using its own file sources.
You can choose voices from fourteen different countries, in languages including Spanish, Chinese, French, Portuguese, Hindi, Dutch, Japanese, Arabic, Italian, German, Hebrew, and others. It offers male and female voices for each language, and it also has the voices of Gwyneth Paltrow and Snoop Dogg.
This is not an endorsement of Speechify (which costs money to use). It is simply my view of where TTS started and what is possible now. The advancement is phenomenal, and Speechify is just one of many TTS programs out there.
The main reason Speechify caught my attention was Snoop Dogg’s voice. You should demo him. What a hoot!