Extraction of Bilingual Terminology using Graphs, Dictionaries and GIZA++
Abstract
In science, industry and many research fields, terminology is developing rapidly. English is most often the “lingua franca” of these areas. As a consequence, in many fields domain terms are coined in English and only later translated into other languages. In this paper, we present an approach to automatic bilingual terminology extraction for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language, and a tool for chunk alignment. We examine the performance of the method in the Library and Information Science domain. The obtained results, as well as the application that implements the method, are available online.
Published
2020-03-16
How to Cite
ŠANDRIH, Branislava; STANKOVIĆ, Ranka.
Extraction of Bilingual Terminology using Graphs, Dictionaries and GIZA++.
Infotheca - Journal for Digital Humanities, [S.l.], v. 19, n. 2, p. 119-138, mar. 2020.
ISSN 2217-9461.
Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/2019.19.2.6_en>. Date accessed: 18 nov. 2024.
doi: https://doi.org/10.18485/infotheca.2019.19.2.6.
Section
Articles