6th International Conference on Natural Language and Speech Processing
December 16-17, 2023
Alex Waibel
Alexander Waibel is Professor of Computer Science at Carnegie Mellon University (USA) and at Karlsruhe Institute of Technology (Germany). He is director of the International Center for Advanced Communication Technologies. Waibel is known for his work in AI, Machine Learning, Multimodal Interfaces and Speech Translation Systems. He introduced consecutive and simultaneous speech translation in 1991 and 2005, respectively. Waibel proposed early Neural Network learning methods, including the TDNN, the first shift-invariant (“convolutional”) Neural Net (1987), and developed many multimodal interaction systems. Waibel founded or co-founded more than 10 startups, including Jibbigo, the first speech translator on a phone (acquired by Facebook in 2013), and Kites, a provider of simultaneous translation services (acquired by Zoom in 2021). Waibel is a member of the National Academy of Sciences of Germany, Fellow of the IEEE and of ISCA, and Research Fellow at Zoom. He holds BS/MS/PhD degrees from MIT and CMU.
Keynote title: Transcending Communication Barriers: From Machine Translation to Language Transparence
Abstract
As we marvel at the impressive advances in Artificial Intelligence in recent years, we may wonder whether the problem of language translation and language barriers has been solved. Aside from remaining technical issues, it is important to note that translation is only one step, albeit an important one, toward making people on the planet understand each other. Our thoughts are expressed in many ways: speech, text, video, handwriting, road signs, facial expressions, voice, lip movement, emotion, gesture, mannerisms and more. For frictionless communication, the way technology is deployed in different settings is just as important a consideration as the performance of the technology itself, and each setting has profound consequences for the technical design and requirements. To make language barriers fade into the background, we need language transparence, not only translation: multimodal, immersive, cross-lingual, culturally aware, proactive communication and dubbing tools that interpret the communicative intent and transcend the barriers between us. In this talk, I will review major milestones on our journey and discuss our latest advances and activities toward this goal.