Personal Names in Information Extraction

Sandra Gucul-Milojević, University of Belgrade, Faculty of Philology

Abstract

The volume of electronic texts produced on the Internet, in digital libraries and archives, grows every day, and with it the need for adequate software tools that let users manipulate and automatically process those texts. The first part of the paper presents various definitions of the field of Information Extraction (IE), a short history of the development of IE methods, and the different types of IE and their possible applications. Methods of information extraction vary: some are simple and based on pattern matching, while others, which use finite-state automata, context-free grammars or statistical models, are considerably more complex. The second part of the paper presents and analyzes a method for the precise automatic recognition, in Serbian-language digital text, of strings that represent a Serbian first name and surname, as well as English names transcribed into Serbian. Personal names are an important part of the lexicon of written texts, whether printed or electronic, and they are widely researched in the field of information extraction. The method described in this work was developed at LADL (Laboratoire d’Automatique Documentaire et Linguistique).
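To illustrate the simplest family of methods mentioned in the abstract, pattern matching, the following is a minimal sketch in Python. It is not the method described in the paper: the gazetteer `FIRST_NAMES`, the regular expression, and the function `find_names` are all assumptions made for illustration. The sketch only handles names in the nominative case, whereas Serbian inflects personal names for case, which is one reason richer formalisms such as finite-state automata are used in practice.

```python
import re

# Hypothetical mini-gazetteer of first names (Serbian Latin script).
# Illustrative only; a real system would use a full name dictionary.
FIRST_NAMES = ["Jelena", "Marko", "Petar", "Sandra"]

# Character classes for a capitalized word, including Serbian diacritics.
UPPER = "A-ZČĆĐŠŽ"
LOWER = "a-zčćđšž"
WORD = rf"[{UPPER}][{LOWER}]+"

# A known first name, whitespace, then a surname that may be hyphenated
# (for example "Gucul-Milojević").
NAME_PATTERN = re.compile(
    rf"\b({'|'.join(FIRST_NAMES)})\s+({WORD}(?:-{WORD})?)\b"
)


def find_names(text):
    """Return (first name, surname) pairs found by simple pattern matching."""
    return [(m.group(1), m.group(2)) for m in NAME_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "Juče su Marko Petrović i Sandra Gucul-Milojević govorili u Beogradu."
    for first, last in find_names(sample):
        print(first, last)
```

Anchoring the pattern on a known first name keeps capitalized sentence-initial words and place names from being reported as names, at the cost of missing any name that is absent from the gazetteer.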

Published

2024-03-06

How to Cite

GUCUL-MILOJEVIĆ, Sandra. Personal Names in Information Extraction. Infotheca - Journal for Digital Humanities, [S.l.], v. 11, n. 1, p. 53a-63a, mar. 2024. ISSN 2217-9461. Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/452>. Date accessed: 23 july 2026.

Issue

Vol 11 No 1 (2010): Infotheca - Journal of Informatics and Librarianship

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

		Faculty of Philology, University of Belgrade
		University Library „Svetozar Marković“
		Association of Libraries of the Universities of Serbia
