Taggers Applied on Texts in Serbian

Zoran Popović
Hemofarm, STADA

Abstract

This paper provides a comparative overview of existing language tools based on taggers and machine learning methods, together with practical tests and results for different taggers applied to texts in Serbian. Previously prepared annotated corpora were used for this purpose, and 10-fold cross-validation served as the testing framework, run in a purpose-built automated testing environment based on Unix scripting (bash, perl, awk). TnT showed the best performance, while the Tree Tagger and SVMTool taggers performed somewhat better in special cases. The possibility of combining different tagging methods and tools (programs), and of integrating them with other NLP environments, opens a wide area for further investigation and experiments with these solutions.
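As an illustration of the 10-fold cross-validation scheme mentioned above, the following is a minimal sketch in Python rather than the bash/perl/awk environment the paper describes. The function names, the interleaved fold assignment, the toy corpus with its tags, and the most-frequent-tag baseline are all assumptions made for the example; the baseline merely stands in for a real tagger such as TnT, Tree Tagger or SVMTool.

```python
from collections import Counter, defaultdict


def k_fold_splits(sentences, k=10):
    """Yield (train, test) pairs; fold i tests on every k-th sentence from i."""
    if not 2 <= k <= len(sentences):
        raise ValueError("k must be between 2 and the number of sentences")
    for i in range(k):
        test = sentences[i::k]
        train = [s for j, s in enumerate(sentences) if j % k != i]
        yield train, test


def train_baseline(train):
    """Hypothetical stand-in tagger: remember the most frequent tag per word."""
    per_word = defaultdict(Counter)
    all_tags = Counter()
    for sentence in train:
        for word, tag in sentence:
            per_word[word][tag] += 1
            all_tags[tag] += 1
    default_tag = all_tags.most_common(1)[0][0]  # used for unknown words
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    return lexicon, default_tag


def accuracy(model, test):
    """Token-level accuracy of the baseline on held-out sentences."""
    lexicon, default_tag = model
    correct = total = 0
    for sentence in test:
        for word, tag in sentence:
            correct += lexicon.get(word, default_tag) == tag
            total += 1
    return correct / total if total else 0.0


def cross_validate(sentences, k=10):
    """Return the mean accuracy over k folds and the per-fold scores."""
    scores = [accuracy(train_baseline(train), test)
              for train, test in k_fold_splits(sentences, k)]
    return sum(scores) / len(scores), scores


# Toy corpus with made-up tags (an assumption, not data from the paper).
toy_corpus = [
    [("pas", "N"), ("laje", "V")],
    [("mačka", "N"), ("spava", "V")],
] * 10

mean_accuracy, fold_scores = cross_validate(toy_corpus, k=10)
```

With a real tagger, `train_baseline` and `accuracy` would be replaced by calls that train the external tool on the training folds and score its output on the held-out fold; the splitting and averaging logic stays the same.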

Published

2024-03-07

How to Cite

POPOVIĆ, Zoran. Taggers Applied on Texts in Serbian. Infotheca - Journal for Digital Humanities, [S.l.], v. 11, n. 2, p. 21-38, mar. 2024. ISSN 2217-9461. Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/464>. Date accessed: 29 july 2026.

Issue

Vol 11 No 2 (2010): Infotheca - Journal of Informatics and Librarianship

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

		Faculty of Philology, University of Belgrade
		University Library „Svetozar Marković“
		Association of Libraries of the Universities of Serbia
