Vol. 14, No. 2, December 2013

Jelena Graovac
University of Belgrade, Faculty of Mathematics,
Department for Computer Science

WORDNET-BASED SERBIAN TEXT CATEGORIZATION
 

UDC: 811.163.41'322.2
Keywords: Natural Language Text Categorization, Serbian Wordnet, the System of Morphological Dictionaries for Serbian
Abstract: A technique for categorizing Serbian texts, based on the Serbian wordnet, is presented. The author is guided by the hypothesis that the morphological, syntactic and semantic information contained in lexical resources can improve the categorization of text documents in Serbian, a morphologically rich language. The Ebart-3 corpus is used for the experiments. It is a collection of newspaper articles in Serbian divided into three categories: Economics, Politics and Sport. The method relies on lists of representative synsets from the Serbian wordnet, one list per category, and on a category assignment function defined on the basis of these lists. Representative synsets are selected according to a significance weight measure of a synset for the considered category. The inflection problem in Serbian is handled by means of the system of morphological dictionaries for Serbian. To evaluate the presented technique, micro- and macro-averaged Precision, Recall and F1 measures are used. For comparison, another technique based on wordnet-encoded semantic domains is also developed. Instead of well-chosen synsets, its representative lists consist of all synsets that belong to the semantic domains corresponding to the considered categories. The results show that the technique based on well-chosen synsets outperforms the technique based on semantic domains, even though the main reason for enriching wordnet with semantic domains is to make it more useful in natural language processing tasks, especially text categorization.
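
The micro- and macro-averaged measures named in the abstract can be illustrated with a minimal sketch. It assumes single-label categorization (each article has exactly one gold and one predicted category) and takes macro-averaged F1 as the mean of the per-category F1 values, which is one common convention; the abstract does not state which definition the paper uses. The function names `prf` and `micro_macro` and the sample labels are hypothetical, not taken from the paper.

```python
from collections import Counter


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def micro_macro(gold, pred, categories):
    """Return ((P, R, F1) micro-averaged, (P, R, F1) macro-averaged)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct assignment
        else:
            fp[p] += 1          # predicted category gains a false positive
            fn[g] += 1          # gold category gains a false negative

    # Macro: compute the measures per category, then average them.
    per_cat = [prf(tp[c], fp[c], fn[c]) for c in categories]
    macro = tuple(sum(m[i] for m in per_cat) / len(categories) for i in range(3))

    # Micro: pool the counts over all categories, then compute once.
    micro = prf(sum(tp[c] for c in categories),
                sum(fp[c] for c in categories),
                sum(fn[c] for c in categories))
    return micro, macro


# Hypothetical toy data using the three Ebart-3 category names.
categories = ["Economics", "Politics", "Sport"]
gold = ["Economics", "Politics", "Sport", "Sport"]
pred = ["Economics", "Sport", "Sport", "Sport"]
micro, macro = micro_macro(gold, pred, categories)
print("micro P/R/F1:", micro)
print("macro P/R/F1:", macro)
```

Micro-averaging weights every document equally, so large categories dominate the score, while macro-averaging weights every category equally, which makes weak performance on a small category more visible.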

ARTICLES

 

JELENA GRAOVAC
WORDNET-BASED SERBIAN TEXT CATEGORIZATION

MILICA VASIĆ
CHANGE OF NEEDS OF LIBRARY USERS CAUSED BY THE CHANGES OF CONCEPTUALISATION OF EXPERIENCING TIME AND INFORMATION UNDER THE INFLUENCE OF MODERN TECHNOLOGIES

D. RUSSELL BAILEY
CREATING DIGITAL HISTORY - CASE STUDY: THE DORR REBELLION PROJECT 

 

PROFESSIONAL PAPERS

 

NEVENA PETROVIĆ, ALEKSANDAR NIKOLIĆ, AGRON BAKIU
PRESENTATION OF THE MULTIMEDIA PROJECT „HOW DID RADIVOJE LOLA ĐUKIĆ AND NOVAK NOVAK(OVIĆ) MAKE US LAUGH?”

 

REVIEWS

 

OLIVERA NASTIĆ 
"FUTURE LIBRARY UNCONFERENCE 2013", ATHENS, 9-10 DECEMBER 2013

ALEKSANDRA ADŽIĆ 
LIBRARY INFORMATION SYSTEM NIBIS – TEN YEARS IN THE SERVICE OF LIBRARIANS