From: David Hill
Subject: Re: [gnuspeech-contact] Re: Thoughts on GNUSpeech and possible accessibility applications
Date: Wed, 8 Apr 2009 16:05:02 -0700

Hi Jason,

On Apr 7, 2009, at 9:06 PM, Jason White wrote:

David Hill <address@hidden> wrote:

You may be interested to check out the paper that describes the "Touch 'n
Talk" system that Dalmazio mentioned in an earlier email.  The direct
link is:

http://pages.cpsc.ucalgary.ca/~hill/papers/ieee-touch-n-talk-1988.pdf

It is an item on my university web site, to which you can also navigate.

Thank you, David, for the reference. This is an interesting paper. It also reminds me of a related solution, developed at approximately the same time by Jim Thatcher for IBM Screen Reader, in which a separate key pad was used for reading and navigational functions. Although I never had an opportunity to use it, I recall that one of the principal difficulties in the early versions was said to be that the system wouldn't automatically read new text presented on screen, or read text in response to cursor movement - the users had to switch frequently between the qwerty keyboard and the screen reader's key pad while interacting with the application software.

This was not a limitation of Touch-'n-Talk, which was designed to integrate control and access within a single haptic-auditory interface in as natural a way as possible. We made a direct comparison between our system and a conventional key-operated talking terminal: our target population of blind users preferred the Touch-'n-Talk system, and the results for all users (five blind subjects and twelve sighted but blindfolded subjects) were comparable. That was a useful result in itself, since it implies that blindfolded subjects can stand in for blind users, at least in exploratory evaluations.


The research in which I, personally, find the most insight is that of T.V. Raman, first in his AsTeR software (Audio System for Technical Readings:
http://www.cs.cornell.edu/home/raman/) and then in Emacspeak
(http://emacspeak.sourceforge.net/ and, for the latest source code,
http://emacspeak.googlecode.com/).

I checked out some of Dr. Raman's examples -- apparently using DECtalk. Audio formatting has the advantage that no special equipment is required.

But imagine if you could access those mathematical formulae with your finger, feeling the spatial relationships between the components with your hand and fingers, checking individual characters, and manipulating a mark as well as a cursor. Touch-'n-Talk was deliberately designed to make spatial cues available without the need for sight, and to allow bookmarks, words, paragraphs, pages, spelling, searching and so on to be handled easily.


Emacspeak works best with synthesizers that allow voice characteristics to be changed dynamically, for example the DECtalk, and it would be interesting to know whether GNUSpeech might eventually support such audio formatting techniques.

Having an articulatory synthesiser means that many different voices can be created dynamically, from child voices through female to male. Having said that, not all voice characteristics are yet well understood, and the gaps are not confined to what is needed for excellent female voices.
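
To make that concrete, here is a minimal sketch, in C, of the kind of parameter set an articulatory synthesiser might expose to an audio-formatting layer such as Emacspeak's. The parameter names, preset values, and the emphasis rule are illustrative assumptions of mine, not the actual gnuspeech interface:

#include <stdio.h>

/* Illustrative only: these are NOT the real gnuspeech parameter
 * names or values, just a plausible articulatory-style voice set. */
typedef struct {
    const char *name;
    double tract_length_cm;  /* longer vocal tract -> lower formants */
    double pitch_hz;         /* mean glottal pitch */
    double breathiness;      /* 0.0 (modal voice) .. 1.0 (breathy) */
} VoicePreset;

static const VoicePreset presets[] = {
    { "child",  13.0, 260.0, 0.30 },
    { "female", 15.5, 210.0, 0.40 },
    { "male",   17.5, 120.0, 0.10 },
};

/* One possible audio-formatting rule: speak emphasized text with a
 * modest pitch rise, leaving the rest of the voice unchanged. */
static VoicePreset emphasized(VoicePreset v)
{
    v.pitch_hz *= 1.15;
    return v;
}

int main(void)
{
    for (size_t i = 0; i < sizeof presets / sizeof presets[0]; i++) {
        VoicePreset e = emphasized(presets[i]);
        printf("%-6s: base pitch %.0f Hz, emphasized %.0f Hz\n",
               presets[i].name, presets[i].pitch_hz, e.pitch_hz);
    }
    return 0;
}

The point is only that, with articulatory control, a formatting layer can derive an emphasized or a contrasting voice from any base voice by nudging physically meaningful parameters, rather than being limited to a synthesiser's fixed built-in voices.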


Further, in his latest work at Google on the accessibility of mobile
telephones, Raman has devised a means of making touch screen input achievable
in an "eyes-free" context.

Being not just "eyes-free" but also providing equivalent facilities through touch and sound were basic design criteria for Touch-'n-Talk, as you obviously realise from reading the paper.



I trust that this digression into speech interface research is not unwelcome on the list; to ensure that it remains on topic, I have sought to connect it to functional requirements of a text-to-speech system.

I think there is a need for a free (as in freedom) TTS system capable of supporting the products of past and current speech interface research while, just as importantly, providing opportunities for future research and free software development efforts.

I also agree with David's observation that many of the most important
requirements are already treated in his 1988 paper, although advances such as Raman's "audio formatting" techniques create additional, desirable features.



I am not sure what was missing compared with Raman's approach. If you have spatial references through touch, the changing pitch is more of a distraction than a help, IMHO.

Warm regards.

david
