Leveraging monolingual Hindi-English parallel corpus for translating Hinglish

Hinglish, a hybrid of Hindi and English, is considered one of the most commonly spoken code-switched languages. Code-switching, also known as code-mixing, is the use of grammatical units from two or more languages within a single utterance. It poses a challenge for Natural Language Processing (NLP) algorithms, which are typically not designed for code-switched data, so their performance may degrade on such input. Additional processing steps such as language identification, normalization, and back-transliteration may be required to improve their effectiveness.
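To make the three preprocessing steps named above concrete, here is a minimal Python sketch. It is only an illustration under strong assumptions: the tiny lexicon `ROMAN_HINDI_TO_DEVANAGARI` and the helper names `normalize`, `identify_language`, and `preprocess` are hypothetical, and dictionary lookup stands in for the trained models a real system would likely use.

```python
import re

# Hypothetical toy lexicon mapping romanized Hindi to Devanagari.
# A real system would rely on trained models or much larger resources.
ROMAN_HINDI_TO_DEVANAGARI = {
    "mujhe": "मुझे",
    "yeh": "यह",
    "bahut": "बहुत",
    "pasand": "पसंद",
    "hai": "है",
}


def normalize(token: str) -> str:
    """Lowercase and collapse runs of three or more identical characters to one."""
    return re.sub(r"(.)\1{2,}", r"\1", token.lower())


def identify_language(token: str) -> str:
    """Toy language identification: a lexicon hit means Hindi, anything else English."""
    return "hi" if token in ROMAN_HINDI_TO_DEVANAGARI else "en"


def preprocess(sentence: str) -> list[tuple[str, str]]:
    """Return (text, language) pairs, back-transliterating Hindi tokens to Devanagari."""
    result = []
    for raw in re.findall(r"[A-Za-z]+", sentence):
        token = normalize(raw)
        lang = identify_language(token)
        text = ROMAN_HINDI_TO_DEVANAGARI[token] if lang == "hi" else token
        result.append((text, lang))
    return result


if __name__ == "__main__":
    for text, lang in preprocess("Mujhe yeh movie bahuuut pasand hai"):
        print(lang, text)
```

Normalization runs first in this sketch so that elongated spellings such as "bahuuut" can still match the lexicon before language identification and back-transliteration are applied.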
