Course description:

This course provides a foundation in current audio and speech technologies and covers techniques for sound processing by humans and machines. Topics include fundamentals of signal processing and pattern recognition, acoustics, auditory perception, speech production and synthesis, and speech estimation. The course will explore applications of speech and audio processing in human-computer interfaces, such as speech recognition, speaker identification, coding schemes (e.g. MP3), music analysis, and noise reduction.

Instructors:

* Dr. Mounya Elhilali, Barton Hall 307, [email protected]
Office Hours: Tuesdays 1:00-3:00pm or by Appointment
* Dr. Aren Jansen, CSEB 226C, [email protected]
Office Hours: Tuesdays 2:15-4:15pm or by Appointment

Teaching Assistant:

Ehsan Jahangiri, [email protected]
Office Hours: Thursdays 2-4pm, CSEB 306.

Place and Time:

Tuesdays / Thursdays: 10:30-11:45am. Barton Hall, Room 117

Prerequisites:

Knowledge of Fourier analysis and signal processing (or permission of instructors)

Textbook:

T. Quatieri (2001) Discrete-Time Speech Signal Processing: Principles and Practice, Prentice Hall
Handouts

Additional Recommended Reading (Reserve in library):

A. V. Oppenheim, R. W. Schafer & J. R. Buck (1999) Discrete-Time Signal Processing, Prentice Hall, 2nd edition.
O'Shaughnessy (1999) Speech Communications: Human and Machine, Wiley Press, 2nd edition.
Christopher Bishop (2006) Pattern Recognition and Machine Learning, Springer
M. Gales and S. Young (2007) The Application of Hidden Markov Models in Speech Recognition, NOW Publishers (available online from JHU for free)

Grading:

Homework 15%
Midterm Exam 25%
Projects 25%
Final Exam 35%
Homework assignments will be distributed every other Tuesday and are due two weeks later at the beginning of class. There will be two projects during the semester.

Important Dates:

Midterm exam: Thursday October 14, 2010, 10:30-11:45am in Barton 117
Project 1: Assigned: Tuesday October 19, 2010 / Due: Tuesday November 9, 2010
Project 2: Assigned: Tuesday November 9, 2010 / Due: Thursday December 2, 2010
Final exam: Friday December 17, 2010, 2-5pm in Barton 117

Class Schedule:

Date Topics
8/31 Introduction, Course Info, Signal processing fundamentals
9/2 Signal processing fundamentals
9/7 Signal processing fundamentals
9/9 Signal processing fundamentals
9/14 Speech production
9/16 Speech production
9/21 Speech perception
9/23 Speech perception
9/28 Speech analysis
9/30 Speech analysis
10/5 Speech analysis
10/7 Speech analysis
10/12 NO CLASS
10/14 MIDTERM EXAM
10/19 Machine learning for speech and audio (1st project assigned)
10/21 Machine learning for speech and audio
10/26 Automatic speech recognition
10/28 Automatic speech recognition
11/2 Automatic speech recognition
11/4 Automatic speech recognition
11/9 Speaker identification (1st project due / 2nd project assigned)
11/11 Speaker identification
11/16 Speech synthesis
11/18 Speech synthesis
11/23 Music analysis
11/25 THANKSGIVING BREAK
11/30 Miscellaneous applications
12/2 Review (2nd project due)