This course provides a foundation in current audio and speech technologies and covers techniques for sound processing by humans and machines. Topics include the fundamentals of signal processing and pattern recognition, acoustics, auditory perception, speech production and synthesis, and speech estimation. The course also explores applications of speech and audio processing in human-computer interfaces, such as speech recognition, speaker identification, coding schemes (e.g., MP3), music analysis, and noise reduction.
Instructor:
Dr. Mounya Elhilali, Barton Hall 307, [email protected]
Office Hours: Tuesdays 1:30-3pm, Thursdays 3-4pm, or by appointment
Teaching Assistant:
Ming Sun
Office: Barton Hall, Rm. 6
Office Hours: Tuesdays 3-6pm and Thursdays 4-6pm
Textbook:
D. O'Shaughnessy (1999) Speech Communications: Human and Machine, Wiley-IEEE Press, 2nd edition.
Additional Recommended Reading (Reserve in library):
B. Gold & N. Morgan (2000) Speech and Audio Signal Processing: Processing and perception of speech and music, Wiley.
T. F. Quatieri (2001) Discrete-Time Speech Signal Processing: Principles and Practice, Prentice Hall Signal Processing Series.
L. Rabiner & R. Schafer (1978) Digital Processing of Speech Signals, Prentice-Hall.
A. V. Oppenheim, R. W. Schafer & J. R. Buck (1999) Discrete-Time Signal Processing, Prentice Hall, 2nd edition.
Grading:
Homework 15%
Midterm Exam 25% (Thursday, October 16th, 2008)
Projects 25%
Final Exam 35% (Friday, December 19th, 2008)
Homework assignments will be distributed every other Tuesday and are due two weeks later at the beginning of class.
There will be two projects during the semester.
Class Schedule
Date     Topics
9/4      Introduction, Course Info, Signal processing review
9/9      Math & signal processing review
9/11     Math & signal processing review
9/16     Review of MATLAB tools
9/18     Sound acoustics
9/23     Speech production
9/25     Speech production
9/30     Speech perception
10/2     Speech perception
10/7     Speech analysis
10/9     Speech analysis
10/14    Speech analysis
10/16    MIDTERM EXAM
10/21    Speech coding (1st project assigned)
10/23    Speech coding
10/28    Speech synthesis
10/30    Speech synthesis
11/4     Automatic speech recognition
11/6     Automatic speech recognition
11/11    Speaker identification (1st project due / 2nd project assigned)