Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation

Machine translation (MT) has benefited from synthetic training data produced by translating monolingual corpora, a technique known as backtranslation. Combining backtranslated data from different sources has led to better results than using data from any single source in isolation. In this work we analyse the impact that data translated with rule-based, phrase-based statistical and neural MT systems has on new MT systems.

Contextualized Translations of Phrasal Verbs with Distributional Compositional Semantics and Monolingual Corpora

This article describes a compositional distributional method to generate contextualized senses of words and identify their appropriate translations in the target language using monolingual corpora. Word translation is modeled in the same way as contextualization of word meaning, but in a bilingual vector space. The contextualization of meaning is carried out by means of distributional composition within a structured vector space with syntactic dependencies, and the bilingual space is created by means of transfer rules and a bilingual dictionary.

State of the Art in Dialogue Systems

This project will survey the technology currently available for advancing dialogue systems in Basque.

Language In The Human-Machine Era (LITHME). COST Action number: CA19102.

LITHME has two aims: to prepare linguistics and its subdisciplines for what is coming, and to facilitate long-term dialogue between linguists and technology developers. How will the rise of these technologies affect language in international law, translation and other kinds of linguistic work? LITHME seeks to prepare linguists and stakeholders for the human-machine era.
