Evaluate multi-lingual fastText embeddings {fr,it,hu} -> en

Another (rather simple) way to build a shared vector space is to map the embeddings of each source language into a common target embedding space, e.g. map the Italian (or French, or Hungarian) vectors into an English embedding space or into a shared Italian-English embedding space. For the latter we could use the multilingual fastText embeddings provided by MUSE (https://github.com/facebookresearch/MUSE).
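As a rough illustration of what a shared space gives us, the sketch below parses two embedding files in the fastText text format (a `count dim` header, then one `word v1 ... vd` line per word) and looks up the nearest English word for an Italian query by cosine similarity. The helper names (`load_vec`, `cosine`, `nearest`) and the tiny three-dimensional vectors are made up for this example; with real data the inline strings would be replaced by already aligned `.vec` files opened from disk.

```python
import io
import math


def load_vec(handle):
    """Parse fastText text format: a 'count dim' header, then 'word v1 ... vd' per line."""
    _count, dim = map(int, handle.readline().split())
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines (e.g. tokens containing spaces)
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(word, source, target):
    """Return the target-language word whose vector is closest to source[word]."""
    query = source[word]
    return max(target, key=lambda w: cosine(query, target[w]))


# Toy vectors (assumption: invented values standing in for aligned embeddings).
IT_VEC = "2 3\ncane 0.9 0.1 0.0\ngatto 0.1 0.9 0.0\n"
EN_VEC = "3 3\ndog 1.0 0.0 0.1\ncat 0.0 1.0 0.1\nhouse 0.0 0.1 1.0\n"

italian = load_vec(io.StringIO(IT_VEC))
english = load_vec(io.StringIO(EN_VEC))

print(nearest("cane", italian, english))
print(nearest("gatto", italian, english))
```

The lookup only makes sense if both files live in the same aligned space; monolingual fastText vectors trained independently per language would not be comparable this way.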

It would also be interesting to run cross-lingual experiments here, e.g. train on the data sets of two languages and test on a third.
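One possible way to organise such an experiment is a leave-one-language-out split, sketched below. The function name `leave_one_language_out` and the document identifiers are placeholders for this example, not part of any existing code; in practice each list would hold the labelled examples of that language.

```python
def leave_one_language_out(datasets):
    """Yield (held_out_language, train_examples, test_examples) for every language."""
    for held_out in sorted(datasets):
        # Training data is the concatenation of all other languages.
        train = [
            example
            for lang in sorted(datasets)
            if lang != held_out
            for example in datasets[lang]
        ]
        yield held_out, train, list(datasets[held_out])


# Placeholder data (assumption: invented identifiers, one list per language).
datasets = {
    "fr": ["fr_doc_1", "fr_doc_2"],
    "it": ["it_doc_1"],
    "hu": ["hu_doc_1"],
}

splits = list(leave_one_language_out(datasets))
for held_out, train, test in splits:
    print(f"test on {held_out}: {len(train)} training / {len(test)} test examples")
```

Each language is used as the test set exactly once, so the three runs together show how well a model trained in the shared space transfers to an unseen language.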

Edited Aug 06, 2020 by Mario Saenger