Fifteen writers and their digital imprints in numbers, images and words
Abstract
In this paper we present the corpus \textsc{15authors}, which contains 49 works by fifteen authors who wrote in the Serbian language at the end of the 19th and the beginning of the 20th century. This corpus was derived from the SrpELTeC corpus, built within the framework of the COST Action ``Distant Reading for European Literary History.'' We used the existing annotations (sentences, phrases in foreign languages, part-of-speech (POS) tags, lemmas and named entities) and conducted additional analyses with the open-source software Unitex and TXM in order to reveal the digital imprints that the selected authors left in their works.
Keywords: literary corpus, textometry, corpus linguistics, distant reading, Serbian language, Unitex, TXM.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


