Authors: Carmen Brando, Catherine Dominguès, Magali Capeyron
Keywords:
Abstract: Ongoing initiatives promoted by cultural institutions and public administrations engage in the development of textual corpora issued from the general public. In this work, we deal with a spoken corpus of life stories and a crowd-sourced Web corpus of people's contributions related to urban planning issues in their city. Located information constitutes an essential component of these corpora. Toponyms refer to official names (e.g. Congo), which are listed in gazetteers, but often also to generic locations such as un endroit très beau (a beautiful place). Because of the nature of these corpora, place mentions are inherently subjective, vague and descriptive. To enable automated exploitation of these texts, it is crucial to properly detect these kinds of place mentions. In this sense, the present work provides a comparative study of state-of-the-art NER systems, most importantly supervised tools such as Stanford NER, for identification thematic