End-to-end (E2E) models allow us to represent the entire speech recognition pipeline (i.e., the conventional acoustic, pronunciation and language models) with a single neural network. This has major implications for privacy, reliability and latency. In this talk, we will discuss the research developments around E2E models that have allowed them to reach the accuracy of conventional models at a fraction of the model size and with reduced latency. In addition, we will present further benefits of these models for multilingual speech recognition.
For more info, visit our page:
#SAIT (Samsung Advanced Institute of Technology): http://smsng.co/sait