Ixa Group

In 2019, the researcher Olatz Perez de Viñaspre Garralda received the Koldo Mitxelena prize, organized by the Basque Language Academy and the Universi

Read more about the 6th Koldo Mitxelena Prize for Theses in Basque

Best paper award of the CoNLL 2018 conference

The paper "Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation" at the CoNLL 2018 conference


Neural network-based coreference resolution for Basque

This work continues earlier research on coreference resolution for Basque by building a neural network-based coreference resolution system. A system originally built for Polish was taken as the starting point and adapted to Basque. Starting from the EPEC-KORREF corpus, mention pairs and their features were extracted, and the neural network was trained to decide whether the two mentions in a pair are coreferent. Coreference clusters were then built from the network's predictions and evaluated.
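The last step, turning pairwise predictions into coreference clusters, can be illustrated with a minimal sketch. The summary does not say how the clusters were actually built, so this assumes one common choice: every pair scored at or above a threshold is linked, and clusters are the connected components of those links (transitive closure via union-find). The function name, the 0.5 threshold, and the example mentions and scores are all hypothetical.

```python
from collections import defaultdict


def cluster_mentions(mentions, pair_scores, threshold=0.5):
    """Group mentions into clusters from pairwise coreference scores.

    Assumption (not stated in the source): a pair is treated as coreferent
    when its score is >= threshold, and clusters are the connected
    components of those links.
    """
    # Each mention starts as the root of its own cluster.
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for m in mentions:
        clusters[find(m)].append(m)
    return sorted(clusters.values())


# Hypothetical mentions and classifier scores, for illustration only.
mentions = ["Miren", "she", "the book", "it"]
pair_scores = {
    ("Miren", "she"): 0.9,
    ("Miren", "the book"): 0.1,
    ("the book", "it"): 0.8,
    ("she", "it"): 0.2,
}
clusters = cluster_mentions(mentions, pair_scores)
print(clusters)
```

Transitive closure is simple, but a single wrong high-scoring link merges two whole clusters. Alternatives such as closest-first or best-first antecedent selection trade that risk differently.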


Adapting NMT to caption translation in Wikimedia Commons for low-resource languages

This paper presents a successful domain adaptation of a general neural machine translation (NMT) system using a bilingual corpus created with captions for images in Wikimedia Commons for the Spanish-Basque and English-Irish pairs.

Keywords: Machine Translation, Low-resource languages, Bilingual corpora, Language resources from Wikipedia


Interpretable Deep Learning to Map Diagnostic Texts to ICD10 Codes

Background: Automatic extraction of the morbid diseases or conditions recorded in death certificates is a critical process, useful for billing, epidemiological studies and comparison across countries. Because these clinical documents are written in ordinary natural language, automatic coding is difficult: spontaneous terms often diverge strongly from standard reference terminology such as the International Classification of Diseases (ICD).

Objective