Extraction and annotation of 'location names'
Abstract
Introduced as part of the Message Understanding Conferences dedicated to information extraction, Named Entity extraction is a well-studied task in Natural Language Processing. The recognition and categorisation of person names, location names, organisation names, etc., is regarded as a fundamental step for a wide variety of natural language processing applications dealing with content analysis. Many research works are devoted to it and report very good results.
One of our objectives is the automatic (or semi-automatic) identification and annotation of location names, so that the most appropriate information extraction methods can then be applied. Our main objective is the combination and interoperability of symbolic and statistical Natural Language Processing (NLP) methods (symbolic rules, machine learning, and data mining).
Our work consisted of recognising named entities, in particular locations, with Unitex, annotating them with Brat, and correcting the annotations manually. The recall and accuracy rates are very encouraging, but the question remains: what is a location name?
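As an illustration of how automatic annotations might be compared with their manually corrected counterparts, the following minimal sketch parses text-bound annotations in Brat's standoff format and computes span-level precision and recall under exact matching. The label name "Location", the sample sentence, the two annotation strings, and the function names are assumptions made for this example; they are not taken from the corpus or the evaluation protocol used in this work.

```python
from typing import NamedTuple, Set, Tuple


class Span(NamedTuple):
    label: str
    start: int
    end: int


def parse_ann(text: str) -> Set[Span]:
    """Parse text-bound annotations (lines starting with 'T') from Brat standoff text."""
    spans = set()
    for line in text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, attributes, notes, blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        label, offsets = parts[1].split(" ", 1)
        # Discontinuous spans look like "0 5;10 15"; keep first start and last end.
        numbers = offsets.replace(";", " ").split()
        spans.add(Span(label, int(numbers[0]), int(numbers[-1])))
    return spans


def precision_recall(auto: Set[Span], gold: Set[Span],
                     label: str = "Location") -> Tuple[float, float]:
    """Exact-match precision and recall for one label (hypothetical label name)."""
    auto_l = {s for s in auto if s.label == label}
    gold_l = {s for s in gold if s.label == label}
    true_positives = len(auto_l & gold_l)
    precision = true_positives / len(auto_l) if auto_l else 0.0
    recall = true_positives / len(gold_l) if gold_l else 0.0
    return precision, recall


# Invented sample text: "Paris is in France near the Seine."
AUTO_ANN = "T1\tLocation 0 5\tParis\nT2\tLocation 9 18\tin France\n"
GOLD_ANN = ("T1\tLocation 0 5\tParis\n"
            "T2\tLocation 12 18\tFrance\n"
            "T3\tLocation 28 33\tSeine\n")

precision, recall = precision_recall(parse_ann(AUTO_ANN), parse_ann(GOLD_ANN))
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact matching is the strictest choice: the automatic span "in France" counts as an error even though it overlaps the corrected span "France". A more lenient overlap-based criterion would be an equally reasonable assumption.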