CSE542 will cover three main topics, each over a 4-week
period:
Introduction to the collection and analysis of speech
data for speech processing
This topic includes a brief introduction to corpus linguistics. Students
will learn about the range and types of spoken-language collections, and
how to analyze speech data using the Praat tool.
Introduction to speech recognition
Students will learn basic technologies for speech recognition, using
the Hidden Markov Model Toolkit (HTK).
Introduction to concatenative text-to-speech synthesis
Students will learn the basics of text-to-speech synthesis (TTS),
as well as current technologies for concatenative TTS. The TTS system
Festival (or its Java version, FreeTTS) will be used.
Integration of speech recognition and TTS into other technologies
(by means of, e.g., VoiceXML and/or the speech SDKs under development
by Microsoft, Sun (Java), and IBM) will also be discussed.