Extraction of Bilingual Terminology using Graphs, Dictionaries and GIZA++

  • Branislava Šandrih, University of Belgrade, Faculty of Philology
  • Ranka Stanković, University of Belgrade, Faculty of Mining and Geology

Abstract

Terminology is developing rapidly in science, industry and many research fields. In most of these areas the lingua franca is English. As a consequence, domain terms in many fields are coined in English and only later translated into other languages. In this paper, we present an approach to automatic bilingual terminology extraction for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language, and a tool for chunk alignment. We examine the performance of the method in the Library and Information Science domain. The obtained results, as well as the application that implements the method, are available online.

Published
2020-03-16
How to Cite
ŠANDRIH, Branislava; STANKOVIĆ, Ranka. Extraction of Bilingual Terminology using Graphs, Dictionaries and GIZA++. Infotheca - Journal for Digital Humanities, [S.l.], v. 19, n. 2, p. 119-138, mar. 2020. ISSN 2217-9461. Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/2019.19.2.6_en>. Date accessed: 15 aug. 2020. doi: https://doi.org/10.18485/infotheca.2019.19.2.6.