Parallel Corpora for Contrastive and Translation Studies

by Irene Doval

US$74.50

Description

This chapter focuses on the normalization of abbreviations and shorthand forms used in French text messages. These forms are difficult to normalize because most of them cannot be resolved by typical spell checkers or dictionary lookups. We first aligned normalized and non-normalized French text messages to build a parallel corpus. We then applied two popular approaches to text normalization: multilingual word embeddings and character-based machine translation. We compare our results and obse