gnuspeech-contact
From: Nickolay V. Shmyrev
Subject: Re: [gnuspeech-contact] Understanding diphones.mxml and improving vocalization quality
Date: Mon, 05 Feb 2007 14:39:08 +0300

On Fri, 26/01/2007 at 07:09 +0000, Omari Stephens wrote:
> Hi, all
> 
> I'm part of a 5-person team at MIT that is participating in the class 6.189:
> Multicore Programming Primer [1], a project-based class in which we implement
> a computationally intensive application on a parallel processor, the
> PlayStation 3's Cell architecture [2].  In short, we are using gnuspeech as
> a reference for our speech synthesis implementation on the PS3.
> 
> I'm currently working on a stripped-down, non-interactive analog of Monet
> to generate postures for the tube.  It seems that everything I would need
> for this is catalogued in diphones.mxml, but we're having trouble figuring
> out how to calculate the transitions (that is, we're unsure how to use the
> rules, transitions, and equations sections).  Any specific help on this
> front, or pointers to useful spots in the source, would be tremendously
> helpful.
> 
> Additionally, other group members are working on finding, implementing,
> and hooking up a more realistic vocal fold model.  From my own poking
> around on the Internet, it seems that most of the models are two-mass
> models, but I haven't read through anything in enough detail to know the
> differences between them.  Is there a model someone would recommend that
> would likely improve the vocalization quality but could also be coded in a
> reasonable amount of time (hopefully a day or less)?  We will probably
> implement this in C or C++, and may put more hands on this part of the
> project if the benefits merit that sort of attention.  Our final product
> is due this coming Friday, 2 Feb.
> 
> Lastly, what other changes could we make to improve the vocalization
> quality?  I had thought of perhaps emulating smoother transitions between
> the different vocal tract regions, but I don't know whether this is
> feasible time-wise, or whether it would make an appreciable improvement in
> the output sound quality.
> 
> [1] http://cag.csail.mit.edu/ps3/
> [2] http://en.wikipedia.org/wiki/Cell_microprocessor
> 
> Thanks very much for your time and any help you all may be able to offer.
> --xsdg, for the 6.189 Speech Synthesis team
> 

Hello Omari

It's unfortunate that your mail was delivered with such a delay :( It's
probably important to be subscribed to the list before posting there.

Work like this is certainly very interesting. Do you have any news now? Did
you succeed in implementing TRM-based synthesis? Friday has already passed,
so perhaps you have some interesting results, or even some code to share.


