From: David Hill
Subject: Re: [gnuspeech-contact] Re: Thoughts on GNUSpeech and possible accessibility applications
Date: Wed, 8 Apr 2009 16:05:02 -0700
Hi Jason,
On Apr 7, 2009, at 9:06 PM, Jason White wrote:
David Hill <address@hidden> wrote:
You may be interested to check out the paper that describes the "Touch 'n Talk" system that Dalmazio mentioned in an earlier email. The direct link is:
http://pages.cpsc.ucalgary.ca/~hill/papers/ieee-touch-n-talk-1988.pdf
It is also an item on my university web site to which you can navigate.
Thank you, David, for the reference. This is an interesting paper. It also reminds me of a related solution, developed at approximately the same time by Jim Thatcher for IBM Screen Reader, in which a separate key pad was used for reading and navigational functions. Although I never had an opportunity to use it, I recall that one of the principal difficulties in the early versions was said to be that the system wouldn't automatically read new text presented on screen, or read text in response to cursor movement - the users had to switch frequently between the qwerty keyboard and the screen reader's key pad while interacting with the application software.
This was not a limitation with Touch-'n-Talk, which was designed to integrate control and access within a single haptic-auditory interface in as natural a way as possible. We made a direct comparison between our system and a conventional key-operated talking terminal: our target population of blind users preferred the "Touch-'n-Talk" system, and the results for all users (there were five blind subjects and twelve sighted but blindfolded subjects) were comparable. That was a useful result, since it implies that tests using blindfolded subjects can be used, at least in exploratory evaluations.
The research in which I, personally, find the most insight is that by T.V. Raman, first in his AsTeR software (Audio System For Technical Readings: http://www.cs.cornell.edu/home/raman/) and then in Emacspeak (http://emacspeak.sourceforge.net/ and, for the latest source code, http://emacspeak.googlecode.com/).
I checked out some of Dr. Raman's examples -- apparently using DECTalk. Audio formatting has the advantage that no special equipment is required.
But imagine if you could access those mathematical formulae by using your finger, feeling the spatial relationships between the components with your hand and fingers, checking individual characters, and manipulating a mark as well as a cursor. "Touch-'n-Talk" was deliberately designed to make spatial cues available without the need for sight, and to allow bookmarks, words, paragraphs, pages, spelling, searching and so on to be handled easily.
Emacspeak works best with synthesizers that allow changes to be made dynamically to voice characteristics, for example the DECTalk, and it would be interesting to know whether GNUSpeech might eventually support such audio formatting techniques.
Having an articulatory synthesiser means that many different voices can be created dynamically, from child voices through female to male. Having said that, not all voice characteristics are well understood, and not only those needed for excellent female voices.
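The kind of "audio formatting" discussed above can be sketched as a mapping from text roles to voice-parameter overrides applied on top of a baseline voice, in the spirit of Emacspeak's voice-lock. This is only an illustration under assumed conventions: the parameter names, units, and the `render` driver interface are invented for the example, and are not GNUSpeech's or the DECTalk's actual API.

```python
# Sketch of audio formatting: each text role perturbs a baseline
# voice, and the synthesizer driver receives (parameters, text) pairs.
# All parameter names and values here are hypothetical.

BASE_VOICE = {"pitch_hz": 110.0, "rate_wpm": 180, "breathiness": 0.1}

# Role-specific overrides layered onto the baseline voice.
ROLE_OVERRIDES = {
    "plain":    {},
    "emphasis": {"pitch_hz": 130.0},                     # raise pitch
    "code":     {"rate_wpm": 150, "breathiness": 0.0},   # slower, flatter
    "heading":  {"pitch_hz": 95.0, "rate_wpm": 160},     # lower, slower
}

def voice_for(role):
    """Return the full parameter set for a text role."""
    params = dict(BASE_VOICE)
    params.update(ROLE_OVERRIDES.get(role, {}))
    return params

def render(segments):
    """Turn (role, text) segments into (params, text) commands
    that a hypothetical synthesizer driver could consume."""
    return [(voice_for(role), text) for role, text in segments]

if __name__ == "__main__":
    for params, text in render([
        ("heading", "Audio formatting"),
        ("plain", "Change the voice, not the words: "),
        ("emphasis", "this part is emphasized"),
        ("code", "x = f(y)"),
    ]):
        print(f'{params["pitch_hz"]:.0f} Hz, {params["rate_wpm"]} wpm: {text}')
```

The design choice being illustrated is that the text keeps a single role markup while the voice varies, which is what lets the same source drive a dynamically reconfigurable synthesizer (articulatory or otherwise) without special hardware.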
Further, in his latest work at Google on the accessibility of mobile telephones, Raman has devised a means of making touch screen input achievable in an "eyes-free" context.
Not just being "eyes-free" but providing equivalent facilities using touch and sound were basic design criteria for "Touch-'n-Talk", as you obviously realise from reading the paper.
I trust that this digression into speech interface research is not unwelcome on the list; to ensure that it remains on topic, I have sought to connect it to the functional requirements of a text-to-speech system.
I think there is a need for a free (as in freedom) TTS system capable of supporting the products of past and current speech interface research while, just as importantly, providing opportunities for future research and free software development efforts.
I also agree with David's observation that many of the most important requirements are already treated in his 1988 paper, although advances such as Raman's "audio formatting" techniques create additional, desirable features.
I am not sure what was missing compared to Raman's approach. If you have spatial references through touch, the changing pitch is more of a distraction than a help, IMHO.
Warm regards.
david
-------------