AI research at Ava for empowering Deaf & hard-of-hearing people with the best live captions
By Alexey Ozerov
Alexey Ozerov, Lead of AI Research at Ava, will give a talk on his work and on Ava's strategy for developing a live captioning solution for any situation.
Abstract
At Ava, our mission is to empower Deaf & hard-of-hearing people and inclusive organizations with the best live captioning solution for any situation. We pay special attention to captioning group meetings, which alternative tools still struggle to handle. Our solution is cross-platform, can caption any situation (in-person, remote or hybrid meetings) and does not require any integration with videoconferencing systems. To facilitate group meetings, Ava allows several devices (e.g., smartphones or laptops) to be connected to the same session. On the one hand, this makes captions easier to view, simply by adding more displays. On the other hand, it creates an ad-hoc microphone array that may improve the captioning itself, thanks to potentially better processing of the recorded audio. Finally, for Deaf & hard-of-hearing users, providing accurate captions is not enough. In a multi-talker environment, it is also crucial to indicate “who is saying what?”. We have developed an online speaker diarization approach to solve this problem. To improve Ava further, our AI team in Paris is working on speaker diarization and on various speech cleaning algorithms such as echo cancellation, source separation and speech denoising. In this talk I will present Ava, demo the application, describe our speaker diarization solution, and present other research topics we are working on.
Biography
Alexey Ozerov is the Lead of AI Research at Ava. He holds a Ph.D. (2006) and an HDR (habilitation à diriger des recherches) (2019) in Signal Processing from the University of Rennes 1 (France). Alexey worked towards his Ph.D. degree from 2003 to 2006 in the labs of France Telecom R&D, in collaboration with the IRISA institute. Earlier, he received an M.Sc. degree in Mathematics from the Saint-Petersburg State University (Russia) in 1999 and an M.Sc. degree in Applied Mathematics from the University of Bordeaux 1 (France) in 2003. From 1999 to 2002, Alexey worked at Terayon Communicational Systems (USA) as an R&D software engineer, first in Saint-Petersburg and then in Prague (Czech Republic). He spent one year (2007) at the Sound and Image Processing Lab at KTH (Royal Institute of Technology), Stockholm, Sweden, a year and a half (2008-2009) at the TELECOM ParisTech / CNRS LTCI Signal and Image Processing (TSI) Department, and two years (2009-2011) with the METISS team of IRISA / INRIA in Rennes. After 8 years with Technicolor Research & Innovation (2011-2019), where he was promoted to the Senior Scientist grade (2014) and elected a Distinguished Member of the Technicolor Fellowship Network (2014), he continued for 2.5 years with InterDigital in Rennes, France, before joining Ava. He was a member of the IEEE Signal Processing Society Audio and Acoustic Signal Processing Technical Committee (2015-2020) and an associate editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing journal (2017-2021). He has been a Senior Member of IEEE since 2017 and received the IEEE Signal Processing Society Best Paper Award in 2014. His research interests include various machine learning methods (e.g., deep learning and probabilistic models) for audio and image/video processing and analysis.
More on the speaker’s website.