Description

This chapter focuses on the normalization of abbreviations and shorthand forms used in French text messages. These forms are difficult to normalize because most of them cannot be resolved by typical spell checkers or dictionary lookups. First, we aligned normalized and non-normalized French text messages to build a parallel corpus. We then applied two popular approaches to text normalization: multilingual word embeddings and character-based machine translation. We compare our results and obse

About this Ebook

Publisher: John Benjamins Publishing Company

Published: September 1, 2025

Language: English

Pages: 313

Format: EPUB

Parallel Corpora for Contrastive and Translation Studies
