Central and South-European language resources in META-SHARE

Maciej Ogrodniczuk Institute of Computer Science, Polish Academy of Sciences
Radovan Garabík Ľudovít Štúr Institute of Linguistics, Slovak Academy of Sciences
Svetla Koeva Institute for Bulgarian Language, Bulgarian Academy of Sciences
Cvetana Krstev University of Belgrade, Faculty of Philology
Piotr Pęzik University of Łódź
Tibor Pintér Research Institute for Linguistics, Hungarian Academy of Sciences
Adam Przepiórkowski Institute of Computer Science, Polish Academy of Sciences
György Szaszák Dept. of Telecommunications and Media Informatics, Budapest University of Technology and Economics
Marko Tadić University of Zagreb, Faculty of Humanities and Social Sciences
Tamás Váradi Research Institute for Linguistics, Hungarian Academy of Sciences
Duško Vitas University of Belgrade, Faculty of Mathematics

Abstract

This paper gives a brief summary of one of the most recent efforts to build a pan-European language technology infrastructure: META-NET, a Network of Excellence consisting of 54 research centres from 33 countries, and specifically its Central and South-European participating project, CESAR. One of the project's major activities is the selection of resources and tools to be collected, validated, standardized, upgraded/extended/cross-lingually aligned and stored in the META-SHARE open resource exchange facility.
The contribution presents the repository that maintains the metadata of the selected resources, the methodology and criteria for their selection, and a detailed view of the resources and tools delivered by the project in 2011. After outlining the META-SHARE metadata model and the synchronized network of metadata servers, the article describes how resources were selected by calculating point values based on evaluation indicators such as resource availability, quality, the quantity of similar resources available, coverage, maturity, sustainability and adaptability. The META-NET Language White Papers, a series of reports on the state of each European language with respect to language technology, are also presented, as are the licensing guidelines put forward by the META-SHARE community, which promote open and free-of-charge use of data and tools through standardized and well-defined legal attributions.

Published

2024-03-04

How to Cite

OGRODNICZUK, Maciej et al. Central and South-European language resources in META-SHARE. Infotheca - Journal for Digital Humanities, [S.l.], v. 13, n. 1, p. 3-26, mar. 2024. ISSN 2217-9461. Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/378>. Date accessed: 21 july 2026.
