TALKER CHARACTERIZATION THROUGH SYNTHESIS
Richard S Mcgowan, Manager
Sensimetrics Corporation
14 Summer Street
Malden, Ma 02148
Grant 1R43DC003740-01 from National Institute On Deafness And Other Communication Disorders IRG: ZRG1
Abstract: The aim of this proposed research is the development of a method for characterizing the speech patterns of individual talkers. This characterization will take the form of a procedure for adjusting the parameters of a speech synthesizer so that it generates output that sounds like a given talker. The procedure will consist of two steps: 1) automatic extraction of a set of acoustic descriptions from recorded utterances of the talker, and 2) adjustment of parameters and rules in a synthesis-by-rule system based on the extracted descriptions. Phase I of the project will examine the feasibility of this approach using a limited number of talkers and sentences. The product to be developed will be a component of a speech synthesizer that is capable of automatically modifying the rules of the synthesizer so that it sounds like any chosen talker. The system will have important applications for laryngectomy and other speech-impaired patients who want synthetic speech that is close to their normal voice. The system will also have applications in speech pathology, since it can characterize deviant voices, as well as in speech recognition and talker verification. PROPOSED COMMERCIAL APPLICATION: Potential commercial application not available.
Keywords: biomedical equipment development, clinical biomedical equipment, speech recognition, speech synthesizer, human subject
Project start date: 1998-04-01
Project end date: 1998-12-31
1R43DC003740-01 (1998): $100000
Grants awarded to Richard S Mcgowan
Production Modeling For Articulatory Recovery
Richard S Mcgowan, Manager
Cress, Llc 1 Seaborn Pl Lexington, Ma 02420
Grant 3R01DC001247-12S1 from National Institute On Deafness And Other Communication Disorders IRG: ZRG1
Abstract: The speech production model developed in this work will serve as an internal model in an analysis-by-synthesis algorithm intended to recover articulatory movement from speech acoustics. The speech production model will have three components: 1) an improved version of the Haskins articulatory synthesizer, ASY, 2) an improved task-dynamic model, and 3) an inverse normalizing map, which relates ASY vocal tract shapes to human vocal tract shapes. For ASY and the task-dynamic model to serve as components of a model of a human talker, the task-dynamic model, with end effectors in ASY, must induce the articulatory kinematics observed in the talker as the image of the inverse normalizing map. ASY and the task-dynamic model are parts of a veridical model of human speech production only in conjunction with an inverse normalizing map. The approach here will be to make ASY and the task-dynamic model as realistic as possible without compromising their relative simplicity. This will enable inverse normalizing maps to be mathematically regular functions, which is an important property for articulatory recovery. Many empirical studies of vocal tract shape have been performed since ASY was constructed, and their results can be incorporated into ASY. Improvements in the transfer function calculation can also be made at this time. Improvements to ASY will necessarily bring improvements in the task-dynamic model, particularly in tongue control and in control during obstruent production. Various methods for constructing the inverse normalizing map, along with its ability to map articulatory kinematics, will also be tested. With all three components, the task-dynamic model will be tested using X-ray microbeam data of human speech production.
Keywords: computational neuroscience, mathematical model, speech, speech synthesizer, artificial intelligence, speech recognition, X ray, behavioral/social science research tag, computer program/software, human data
Project start date: 1991-09-01
Project end date: 2005-06-30
3R01DC001247-12S1 (2003): $49334
5R01DC001247-12 (2003): $138502
5R01DC001247-11 (2002): $138502
2R01DC001247-10A1 (2001): $137046
RECOVERING ARTICULATORY MOVEMENT FROM SPEECH ACOUSTICS
Richard S Mcgowan, Manager
Haskins Laboratories, Inc.
New Haven, Ct 06511
Grant 5R29DC001247-04 from National Institute On Deafness And Other Communication Disorders IRG: CMS
Abstract: The proposed work will provide mathematical and computational techniques for recovering articulatory movement from speech acoustics. Such techniques would be of value to experimental speech science, where the coordinated movements of the articulators are often the objects of investigation. Such a procedure can provide articulatory information when no movement measurement is possible, as is the case outside the laboratory or with young children. Trajectories of acoustic variables, such as the time course of a formant frequency, will be mapped onto trajectories of articulatory variables, such as the time course of jaw angle. The optimization required for the mapping will be performed once over entire trajectories, corresponding to consonant-vowel sequences. This should make the method efficient, as well as unambiguous. The set of articulatory variables will include aerodynamic and laryngeal variables, such as subglottal pressure and vocal fold tension. These variables will help to provide detailed information during obstruent production, and in transitions to or from obstruents. Least-squares optimization procedures based on relations between the articulation and the resulting sound output will be used.
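As an illustration only, the idea of a single least-squares fit over a whole trajectory can be sketched as follows. The linear forward model, the straight-line form of the jaw-angle trajectory, and the names `forward_model` and `recover_trajectory` are assumptions made for this example; they are not taken from the grant.

```python
def forward_model(theta):
    """Hypothetical linear map from jaw angle (degrees) to a formant frequency (Hz)."""
    return 300.0 + 20.0 * theta


def recover_trajectory(times, formants):
    """Fit theta(t) = p0 + p1 * t to an entire formant track in one least-squares solve.

    Because the assumed forward model is linear, minimizing the squared formant
    error is equivalent to an ordinary least-squares line fit to the angles
    obtained by inverting the model frame by frame.
    """
    angles = [(f - 300.0) / 20.0 for f in formants]
    n = len(times)
    sum_t = sum(times)
    sum_tt = sum(t * t for t in times)
    sum_y = sum(angles)
    sum_ty = sum(t * y for t, y in zip(times, angles))
    # Closed-form solution of the 2x2 normal equations.
    det = n * sum_tt - sum_t * sum_t
    p1 = (n * sum_ty - sum_t * sum_y) / det
    p0 = (sum_y - p1 * sum_t) / n
    return p0, p1


# Usage: synthesize a formant track from a known trajectory, then recover it.
times = [0.0, 0.1, 0.2, 0.3]
formants = [forward_model(5.0 + 10.0 * t) for t in times]
p0, p1 = recover_trajectory(times, formants)
```

A realistic forward model would be nonlinear, so the closed-form solve would be replaced by an iterative least-squares method; the point of the sketch is only that all frames of the trajectory enter one optimization.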
Keywords: mathematical model, speech, statistics/biometry, vocal cord, larynx, respiratory airflow measurement
Project start date: 1991-09-01
Project end date: 1996-08-31
5R29DC001247-04 (1994): $94019
Articulatory Recovery From Speech Acoustics
Richard S Mcgowan, Manager
Cress, Llc 1 Seaborn Pl Lexington, Ma 02420
Grant 5R01DC001247-14 from National Institute On Deafness And Other Communication Disorders IRG: MFSR
Abstract: The project's goal is to build methods and algorithms for the recovery of articulatory movement from acoustic speech signals. In the immediate future this will be accomplished by employing analysis-by-synthesis techniques for sonorant sounds of English, but with a view to extending this to obstruents and the speech of other languages. Four main areas of research will be pursued to reach these goals. The first area is the study of the relationship between small changes in articulation and the resulting acoustics, which is feasible now that a large amount of simultaneously recorded articulatory movement and acoustic data is available. The second area of research is the kinematics of articulatory movement, particularly that of the tongue. Flesh point data enable a data-driven approach to the modeling of tongue kinematics. A quantitative approach to the kinematics of line segments between flesh points, secant lines, is being pursued because it is a first approximation to a kinematics of tongue shape, which is more closely related to acoustics than the flesh points themselves. Because the method for articulatory recovery is analysis-by-synthesis, it is necessary to have an articulatory synthesizer as an internal speech production model. The third area of research is the construction of an articulatory synthesizer that includes knowledge gained from the second area of research. With progress already made in articulatory synthesis, it is possible to concentrate on the kinematic control of the synthesizer. This will also be done with a data-driven approach, using piecewise polynomials fit to secant line kinematic trajectories derived from flesh point data. Finally, the fourth area of research is the recovery algorithms themselves, which include methods for normalization between different vocal tracts and the articulatory synthesizer's vocal tract.
Both veridical (actual space-time articulatory trajectories are reproduced) and non-veridical (categorical segmental properties are reproduced for perceptually accurate identification) articulatory recovery will be tested. Veridical articulatory recovery from speech acoustics is useful for the laboratory and clinic when acoustic and partial articulatory information are available and the scientist or clinician wants to know more about articulation. Non-veridical articulatory recovery is important for models of speech and language learning.
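The secant lines mentioned in the abstract can be illustrated with a short sketch. It assumes that flesh points are given as (x, y) coordinates ordered along the tongue in a single frame; the function name `secant_lines` and the choice of length and angle as descriptors are hypothetical and are not drawn from the project.

```python
import math


def secant_lines(flesh_points):
    """Return (length, angle in radians) for each segment between adjacent flesh points.

    flesh_points is a list of (x, y) positions ordered along the tongue.
    """
    segments = []
    for (x0, y0), (x1, y1) in zip(flesh_points, flesh_points[1:]):
        dx, dy = x1 - x0, y1 - y0
        segments.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return segments


# Usage: three flesh points give two secant lines.
frame = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
lines = secant_lines(frame)
```

Applying the function to every frame of a recording would yield length and angle trajectories over time, which is the kind of signal a piecewise polynomial could then be fit to.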
Keywords: speech, base, biomechanics, human, language, learning, model, morphology, motivation, sound, tongue, training, clinical research
Project start date: 1991-09-01
Project end date: 2011-02-28
5R01DC001247-14 (2007): $194594
2R01DC001247-13A2 (2006): $198938
3R01DC001247-16S1 (2009): $93640
RECOVERING ARTICULATORY MOVEMENT FROM SPEECH ACOUSTICS
Richard S Mcgowan, Manager
Sensimetrics Corporation 48 Grove St, Ste 305n Somerville, Ma 02144
Grant 5R01DC001247-09 from National Institute On Deafness And Other Communication Disorders IRG: CMS
Abstract: (Adapted from the Investigator's abstract) Dr. McGowan has been active in developing the Haskins Laboratories articulatory model for a number of years, with the aim of recovering the articulatory trajectories of the speaker's vocal tract from the acoustic signal. The method involves analyzing the signal to devise a hypothesis of how the vocal tract must have behaved in order to produce that signal, and then resynthesizing the signal using those articulatory parameters. A genetic algorithm is used as the optimization algorithm to recover the speaker's behavior; it performs fitness-proportionate selection, mating, and mutation, iterating for 60 generations. The resulting articulatory hypothesis can be compared with articulatory data from the speaker, and the synthesized speech signal can be compared to the original acoustic signal. The system was developed on the basis of formant frequencies of sonorant speech sounds like vowels, since the relationship between articulatory shape and acoustic outcome is particularly well worked out for these acoustic parameters. The proposed work will extend the articulatory model to the more challenging problem of non-sonorant consonants, which include abrupt discontinuities introduced into the signal by tighter constrictions in the vocal tract and more sudden releases of those constrictions. It will also focus on modelling the particular vocal tracts of several speakers in the extensive articulatory database developed by John Westbury and his colleagues using the x-ray microbeam facility at the University of Wisconsin; simultaneous pellet tracks and acoustic recordings are available for a wide variety of consonants and vowels.
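The genetic-algorithm loop described in the abstract (fitness-proportionate selection, mating, mutation, 60 generations) can be sketched roughly as below. This is a toy under stated assumptions, not the investigator's implementation: `synthesize` is a made-up two-parameter stand-in for an articulatory synthesizer, and the population size, uniform crossover, Gaussian mutation, fitness formula, and best-so-far bookkeeping are all choices made for the example.

```python
import random


def synthesize(params):
    """Hypothetical stand-in for an articulatory synthesizer: (jaw, tongue) -> (F1, F2) in Hz."""
    jaw, tongue = params
    return (300.0 + 400.0 * jaw, 2200.0 - 1200.0 * tongue + 100.0 * jaw)


def error(params, target):
    """Euclidean distance between synthesized and target formants."""
    return sum((s - t) ** 2 for s, t in zip(synthesize(params), target)) ** 0.5


def fitness(params, target):
    """Positive fitness that grows as the acoustic error shrinks (assumed form)."""
    return 1.0 / (1.0 + error(params, target))


def recover(target, pop_size=40, generations=60, mutation_sd=0.05, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    best = min(population, key=lambda p: error(p, target))
    for _ in range(generations):
        weights = [fitness(p, target) for p in population]
        next_population = []
        for _ in range(pop_size):
            # Fitness-proportionate (roulette-wheel) selection of two parents.
            mother, father = rng.choices(population, weights=weights, k=2)
            # Mating: uniform crossover, each parameter taken from either parent.
            child = tuple(m if rng.random() < 0.5 else f
                          for m, f in zip(mother, father))
            # Mutation: small Gaussian perturbation, clamped to [0, 1].
            child = tuple(min(1.0, max(0.0, g + rng.gauss(0.0, mutation_sd)))
                          for g in child)
            next_population.append(child)
        population = next_population
        candidate = min(population, key=lambda p: error(p, target))
        if error(candidate, target) < error(best, target):
            best = candidate  # keep the best hypothesis seen so far
    return best


# Usage: recover parameters for formants produced by a known configuration.
target = synthesize((0.3, 0.7))
estimate = recover(target)
```

With this fitness formula the selection pressure is weak when errors are large, so a practical system would likely scale or rank the fitness values; the sketch only shows how the three operators fit together in one generation loop.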
Keywords: body movement, mathematical model, speech, vocal cord, behavioral/social science research tag, human data
Project start date: 1991-09-01
Project end date: 2001-08-31
5R01DC001247-09 (1999): $191007
5R01DC001247-08 (1998): $185940
5R01DC001247-07 (1997): $178789
ARTICULATORY RECOVERY FROM SPEECH ACOUSTICS
Richard S Mcgowan
Cress, Llc, 1 Seaborn Pl, Lexington, Ma 02420
Grant 5R01DC001247-17 from National Institute On Deafness And Other Communication Disorders
Keywords: ASY; Acoustic; Acoustics; Algorithms; American; Anterior; Area; Articulation; Articulators; Cell Communication and Signaling; Cell Signaling; Clinic; Clinical; Code; Coding System; Coupled; Data; Future; Goals; Grant; Human; Human, General; Individual Differences; Intracellular Communication and Signaling; Investigators; Joints; KIAA0886; Knowledge; Laboratories; Language; Language Development; Learning; Left; Man (Taxonomy); Man, Modern; Maps; Methods; Methods and Techniques; Methods, Other; Modeling; Morphology; Movement; NI220/250; NOGO; NOGOA; NOGOB; NOGOC; NSP-CL; Output; Persons; Position; Positioning Attribute; Procedures; Production; Property; Property, LOINC Axis 2; RTN4; RTN4 gene; Recovery; Research; Research Personnel; Researchers; Science; Scientist; Shapes; Signal Transduction; Signal Transduction Systems; Signaling; Sound; Sound - physical agent; Specific qualifier value; Specified; Speech; Speech Acoustic; Speech Acoustics; Speech Development; Techniques; Testing; Time; Tongue; Training; Work; acquiring language skills; base; biological signal transduction; body movement; improved; interest; kinematics; language acquisition; language learning; mathematical algorithm; sound; tool
Project start date: 1991-09-01
Project end date: 2011-02-28
Budget start date: 1-MAR-2010
Budget end date: 28-FEB-2011
5R01DC001247-17 (2010): $194579
RECOVERING ARTICULATORY MOVEMENT FROM SPEECH ACOUSTICS
Richard S Mcgowan, Manager
Sensimetrics Corporation 48 Grove St, Ste 305n Somerville, Ma 02144
Grant 2R01DC001247-06 from National Institute On Deafness And Other Communication Disorders IRG: CMS
Project start date: 1991-09-01
Project end date: 2000-08-31
2R01DC001247-06 (1996): $186361