Central and South-European language resources in META-SHARE

  • Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
  • Radovan Garabík, Ľudovít Štúr Institute of Linguistics, Slovak Academy of Sciences
  • Svetla Koeva, Institute for Bulgarian Language, Bulgarian Academy of Sciences
  • Cvetana Krstev, University of Belgrade, Faculty of Philology
  • Piotr Pęzik, University of Łódź
  • Tibor Pintér, Research Institute for Linguistics, Hungarian Academy of Sciences
  • Adam Przepiórkowski, Institute of Computer Science, Polish Academy of Sciences
  • György Szaszák, Dept. of Telecommunications and Media Informatics, Budapest University of Technology and Economics
  • Marko Tadić, University of Zagreb, Faculty of Humanities and Social Sciences
  • Tamás Váradi, Research Institute for Linguistics, Hungarian Academy of Sciences
  • Duško Vitas, University of Belgrade, Faculty of Mathematics

Abstract

This paper gives a brief summary of one of the most recent efforts to build a pan-European language technology infrastructure: META-NET, a Network of Excellence consisting of 54 research centres from 33 countries, and specifically its Central and South-European participating project, CESAR. One of the project's major activities is the selection of the resources and tools to be collected, validated, standardized, upgraded, extended or cross-lingually aligned, and stored in the META-SHARE open resource exchange facility.
The contribution focuses on presenting the repository that maintains the metadata of the selected resources, the methodology and criteria for their selection, and a detailed view of the resources and tools delivered by the project in 2011. After highlighting the concepts of the META-SHARE metadata model and the synchronized network of metadata servers, the article presents the methodology and criteria for resource selection, in which point values are calculated from solid evaluation indicators such as resource availability, quality, quantity of similar resources available, coverage, maturity, sustainability and adaptability. The META-NET Language White Papers, a series of reports on the state of each European language with respect to language technology, are also presented, as are the licensing guidelines put forward by the META-SHARE community, which promote open and free-of-charge use of data and tools through standardized and well-defined legal attributions.

Published
2024-03-04
How to Cite
OGRODNICZUK, Maciej et al. Central and South-European language resources in META-SHARE. Infotheca - Journal for Digital Humanities, [S.l.], v. 13, n. 1, p. 3-26, mar. 2024. ISSN 2217-9461. Available at: <https://infoteka.bg.ac.rs/ojs/index.php/Infoteka/article/view/378>. Date accessed: 21 dec. 2024.