Vol. 11, No. 1, April 2010
Sandra Gucul-Milojević
University of Belgrade, Faculty of Philology
PERSONAL NAMES IN INFORMATION EXTRACTION
UDC: 004.832.2:025.4
Keywords: personal name, information extraction, electronic text, finite state automata, electronic dictionary, local grammar, computational linguistics
Abstract: The production of electronic texts on the Internet, in digital libraries and archives, increases every day, and with it grows the need for adequate software tools that enable users to manipulate texts and process them automatically. The first part of the paper presents various definitions of the field of Information Extraction (IE), a short history of the development of IE methods, and the different types and possible applications of IE. There are various methods of information extraction: some are simple methods based on pattern matching, while others, which use finite-state automata, context-free grammars or statistical models, are considerably more complex. The second part of the paper presents and analyzes a method for the precise automatic recognition, in Serbian-language digital text, of strings representing Serbian first names and surnames, as well as English names transcribed into Serbian. Personal names are an important part of the lexicon of written texts, whether printed or electronic, and they are widely researched in the field of information extraction. The method described in this work was developed at LADL (Laboratoire d’Automatique Documentaire et Linguistique).
ARTICLE
Tomaž Erjavec
TEXT ENCODING INITIATIVE GUIDELINES AND THEIR LOCALISATION
Annibale Elia, Simonetta Vietri
LEXIS-GRAMMAR & SEMANTIC WEB
Biljana Kalezić
SOFTWARE PIRACY IN SERBIA
Sandra Gucul-Milojević
PERSONAL NAMES IN INFORMATION EXTRACTION
REVIEWS
Marija Stiković
THE FIRST EUROPEAN SUMMER SCHOOL “CULTURE & TECHNOLOGY”
Adam Sofronijević
AN INSIGHT INTO ETHICS IN SCIENCE AND CULTURE