ISSN:2394-3661 | Crossref DOI | SJIF: 5.138 | PIF: 3.854

International Journal of Engineering and Applied Sciences

(An ISO 9001:2008 Certified Online and Print Journal)

Automatic Multilingual Code-Switching Speech Recognition

(Volume 7, Issue 5, May 2020) OPEN ACCESS
Author(s):

Nguyen Tuan Anh, Dang Thi Hien, Nguyen Thi Hang

Keywords:

Code-Switching, ASR, multi-tone languages, multilingual modeling techniques

Abstract:

In this study, an efficient yet accurate end-to-end multilingual code-switching speech recognition model has been developed that converts raw speech audio directly into text in multiple languages. Using a single system for multiple languages eliminates the need for a separate model per language, which increases feature sharing between languages and reduces the latency of hybrid systems; the system can also be extended to other targets. Unlike single-language Automatic Speech Recognition (ASR) models, which encode characters or words, the multilingual model applies the same encoding to all languages, while the vocabulary is encoded into a numerical dictionary that is partitioned by language. The single end-to-end system converts multilingual raw audio directly into dictionary entries, the Unicode numbers of the words of each language, which are mapped 1:1 to text in the corresponding language. This method allows expansion to an unlimited number of languages; furthermore, it identifies languages automatically without the need for a separate model. The model uses word pieces, as opposed to graphemes, to reduce the modeling-unit gap between languages. The proposed network has been validated on Chinese and Vietnamese, demonstrating a significant improvement in accuracy over other single-language and multilingual modeling techniques for monosyllabic and multi-tone languages.
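
The partitioned numerical dictionary described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny word-piece lists, the function names, and the choice of one contiguous ID block per language are all assumptions made for illustration. It shows only how a 1:1 mapping from IDs to text can also yield the language of each unit without a separate language-identification model.

```python
from bisect import bisect_right

# Hypothetical word-piece inventories (assumption: the paper's real
# vocabularies and ID layout are not given in the abstract).
VOCAB_BY_LANGUAGE = {
    "zh": ["你", "好", "世界"],
    "vi": ["xin", "chào", "thế", "giới"],
}


def build_partitioned_vocab(vocab_by_language):
    """Give each language one contiguous block of integer IDs."""
    id_to_piece = []  # index = ID, value = word piece (the 1:1 mapping)
    starts = []       # first ID of each language block, ascending
    languages = []    # language code of each block, same order as starts
    for lang, pieces in vocab_by_language.items():
        starts.append(len(id_to_piece))
        languages.append(lang)
        id_to_piece.extend(pieces)
    return id_to_piece, starts, languages


def language_of(token_id, id_to_piece, starts, languages):
    """Infer the language from the ID block that contains token_id."""
    if not 0 <= token_id < len(id_to_piece):
        raise ValueError(f"unknown token id: {token_id}")
    return languages[bisect_right(starts, token_id) - 1]


def decode(token_ids, id_to_piece, starts, languages):
    """Map predicted IDs to (language, word piece) pairs."""
    return [
        (language_of(i, id_to_piece, starts, languages), id_to_piece[i])
        for i in token_ids
    ]


id_to_piece, starts, languages = build_partitioned_vocab(VOCAB_BY_LANGUAGE)

# A code-switched ID sequence, as a recognizer might emit it.
decoded = decode([0, 1, 3, 4], id_to_piece, starts, languages)
```

Under this assumed layout, adding a language only appends a new ID block, so the IDs already assigned to existing languages stay unchanged.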

Paper Statistics:

Total Views: 648 | Downloads: 639 | Page No: 47-53
