Enrichment of Renaissance Texts with Proper Names
Abstract
The aim of the Renom project was to enrich Renaissance
texts with proper names. These texts present two challenges:
they vary widely because of inconsistent spelling, and they are
heavily laden with XML-TEI tags introduced to preserve the
exact format of the original edition. The task consisted of adding
Named Entity tags to this markup by tagging names that had not
already been tagged, together with their left, and sometimes right, context
when appropriate. To achieve this, we improved CasSys, the free,
open-source program for parsing texts with Unitex graph cascades,
and we built specific dictionaries and cascades. The
evaluation showed that the slot error rate of name tagging was
6.1%. Renaissance texts enriched in this way are used on a website
that unites Humanities and tourism by allowing visitors to navigate
maps with names.
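
For readers unfamiliar with the metric, the slot error rate (SER) reported above is commonly defined as the number of substituted, deleted, and inserted slots divided by the number of slots in the reference annotation. The sketch below illustrates that common unweighted definition only; the evaluation summarized here may have used a weighted variant, and the function name and the counts in the usage line are hypothetical, not figures from the project.

```python
def slot_error_rate(substitutions: int, deletions: int, insertions: int,
                    reference_slots: int) -> float:
    """Unweighted slot error rate: (S + D + I) / N.

    S = reference names tagged with the wrong type or extent,
    D = reference names the system missed,
    I = spurious names the system produced,
    N = number of names in the reference annotation.
    """
    if reference_slots <= 0:
        raise ValueError("reference_slots must be positive")
    return (substitutions + deletions + insertions) / reference_slots


# Illustrative counts only (not taken from the evaluation).
example_ser = slot_error_rate(substitutions=2, deletions=1, insertions=1,
                              reference_slots=50)
print(f"SER = {example_ser:.1%}")
```

Unlike precision and recall, SER folds all three error types into a single figure, and it can exceed 100% when the system inserts many spurious names.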