Authors: Jiří Martínek, Ladislav Lenc, Pavel Král
DOI: 10.1007/978-3-030-01418-6_60
Keywords:
Abstract: Cross-lingual document representation can be done by training monolingual semantic spaces and then using bilingual dictionaries with a transform method to project word vectors into a unified space. The main goal of this paper is to evaluate three promising transform methods on a cross-lingual document classification task. We also propose, evaluate, and compare two cross-lingual document classification approaches. We use a popular convolutional neural network (CNN) and compare its performance with a standard maximum entropy classifier. The proposed methods are evaluated on four languages, namely English, German, Spanish, and Italian, from the Reuters corpus. We demonstrate that the results of all transformation methods are close to each other; however, the orthogonal transformation gives generally slightly better results when a CNN with trained embeddings is used. The experimental results also show that the CNN achieves better results than the maximum entropy classifier. We further show that the proposed methods are competitive with the state of the art.
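The abstract mentions an orthogonal transformation that projects word vectors from one monolingual space into another using a bilingual dictionary. The sketch below illustrates one common way to obtain such a mapping, the orthogonal Procrustes solution via SVD. It is an assumption that this matches the paper's formulation. The function name `orthogonal_map`, the toy dimensions, and the synthetic "dictionary" pairs are hypothetical and are not taken from the paper.

```python
import numpy as np


def orthogonal_map(X, Y):
    """Return the orthogonal matrix W minimizing ||X @ W - Y||_F.

    X and Y hold one row per dictionary pair: row i of X is the source-language
    vector and row i of Y is the vector of its translation (hypothetical setup).
    """
    # Orthogonal Procrustes: with X^T Y = U S V^T, the minimizer is W = U V^T.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


# Synthetic stand-in for a bilingual dictionary: the "target" vectors are an
# exact rotation of the "source" vectors, so the mapping is recoverable.
rng = np.random.default_rng(0)
n_pairs, dim = 50, 5
X = rng.normal(size=(n_pairs, dim))
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # random orthogonal matrix
Y = X @ Q

W = orthogonal_map(X, Y)
projected = X @ W  # source vectors expressed in the target space
```

With real embeddings the two spaces are not exact rotations of each other, so `X @ W` would only approximate `Y`. Constraining `W` to be orthogonal preserves vector lengths and angles within the source space.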