Authors: Girish Palshikar, Sangameshwar Patil, Sachin Pawar
DOI:
Keywords:
Abstract: The Named Entity extraction (NEX) problem consists of automatically constructing a gazette containing instances for each NE of interest. NEX is important for domains which lack a corpus with tagged NEs. In this paper, we propose a new unsupervised (bootstrapping) technique, based on a variant of the Multiword Expression Distance (MED) (Bu et al., 2010) and information distance (Bennett 1998). The efficacy of our method is shown using a comparison with BASILISK and PMI in the agriculture domain. Our method discovered 8 diseases that are not found in Wikipedia.