6th International Conference on Natural Language and Speech Processing

December 16-17, 2023

Prof. Dr. Alex Waibel, Carnegie Mellon University, USA

Alex Waibel

Alexander Waibel is Professor of Computer Science at Carnegie Mellon University (USA) and at Karlsruhe Institute of Technology (Germany). He is director of the International Center for Advanced Communication Technologies. Waibel is known for his work in AI, Machine Learning, Multimodal Interfaces and Speech Translation Systems. He introduced consecutive and simultaneous speech translation in 1991 and 2005, respectively. Waibel proposed early Neural Network learning methods, including the TDNN, the first shift-invariant (“convolutional”) Neural Net (1987), and built many multimodal interaction systems. Waibel founded or co-founded more than 10 startups, including Jibbigo, the first speech translator on a phone (acquired by Facebook in 2013), and Kites, a simultaneous translation service (acquired by Zoom in 2021). Waibel is a member of the National Academy of Sciences of Germany, a Fellow of the IEEE and of ISCA, and a Research Fellow at Zoom. He holds BS/MS/PhD degrees from MIT and CMU.

Keynote title: Transcending Communication Barriers: From Machine Translation to Language Transparence


As we marvel at the impressive advances in Artificial Intelligence in recent years, we may wonder whether the problem of language translation and language barriers has been solved. Aside from the remaining technical issues, it is important to note that translation is only one step, albeit an important one, toward making people on the planet understand each other. Our thoughts are expressed in many ways: speech, text, video, handwriting, road signs, facial expressions, voice, lip movement, emotion, gesture, mannerisms and more. For frictionless communication, the way technology is deployed in different settings is just as important a consideration as the performance of the technology itself, and these settings have profound consequences for technical design and requirements. To make language barriers fade into the background, we need language transparence, not only translation: multimodal, immersive, cross-lingual, culturally aware, proactive communication and dubbing tools that interpret the communicative intent and transcend the barriers between us. In this talk, I will review major milestones on our journey and discuss our latest advances and activities toward this goal.

Prof. Najim Dehak, Johns Hopkins University, USA

Najim Dehak

An expert in machine learning, speech processing and speaker identification, Najim Dehak is internationally known as the lead developer of I-vector, a factor analysis-based speaker recognition technique. His research focuses on speech processing and modeling, audio segmentation, and speaker, language, and emotion recognition. One of his interests has been building robust emotion detection systems that can be useful in several areas, including call centers, mental health, and social applications. He is also currently interested in topics related to human aging. In this area, Dr. Dehak and his team are developing non-invasive, artificial intelligence-based tools to detect, assess, and monitor the functional and cognitive decline of elderly adults.

Keynote title: Biosignal-based Digital Biomarkers for Aging



Currently, there are more Americans aged 65 and older (over 49 million) than at any other time in history, according to the US Census Bureau. A significant increase in the number of individuals with severe chronic conditions will have profound social and economic effects on society. Three aspects describe the human aging process: functional (motor system), cognitive, and behavioral (social and psychological stressors). In this talk, we will describe several tools to detect, assess, and monitor the functional and cognitive decline of elderly adults. These tools, called biomarkers, are based on multimodal biosignals such as speech, handwriting, and eye movement. In addition, we will describe our current work on emotion recognition from speech, which can be used to assess social and psychological stressors.