Method for generating training data for medical text abbreviation and acronym normalization

Author: Pakhomov, S Sergey

DOI:

Keywords:

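Abstract: A method for electronically generating high-quality feature vectors that can be used in connection with electronic data processing systems implementing Maximum Entropy or other statistical models to accurately normalize abbreviations in text such as medical records. An abbreviation database and a training database are provided. The abbreviation database includes abbreviations representative of associated expansions to be normalized. The training database includes a corpus of text having the expansions. The training database is processed as a function of the abbreviation database to identify the expansions in the text. Context information describing the context of the text in which the expansions were identified is generated. A set of feature vectors is also stored. Each feature vector includes the generated context information and the associated expansion.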
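
The steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `FeatureVector`, `tokenize`, `generate_feature_vectors`, `ABBREVIATION_DB`, and `CORPUS`, the sample entries, and the choice of a fixed token window as "context information" are all assumptions made for illustration. The sketch scans a small corpus for the expansions listed in an abbreviation database and records, for each match, the surrounding tokens together with the expansion and its abbreviation.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FeatureVector:
    """One training example: an expansion plus the context it was found in."""
    abbreviation: str
    expansion: str
    left_context: List[str]
    right_context: List[str]


# Hypothetical abbreviation database: abbreviation -> known expansions.
ABBREVIATION_DB: Dict[str, List[str]] = {
    "RA": ["rheumatoid arthritis", "right atrium"],
    "PT": ["physical therapy"],
}

# Hypothetical training corpus containing fully spelled-out expansions.
CORPUS: List[str] = [
    "The patient has rheumatoid arthritis in both hands.",
    "Enlarged right atrium noted on echo.",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_feature_vectors(
    corpus: List[str],
    abbreviation_db: Dict[str, List[str]],
    window: int = 2,
) -> List[FeatureVector]:
    """Find each expansion in the corpus and record its surrounding tokens.

    The context here is simply `window` tokens on each side of the match;
    this is an assumed, simplified notion of context information.
    """
    vectors: List[FeatureVector] = []
    for abbreviation, expansions in abbreviation_db.items():
        for expansion in expansions:
            target = tokenize(expansion)
            n = len(target)
            if n == 0:
                continue
            for document in corpus:
                tokens = tokenize(document)
                for i in range(len(tokens) - n + 1):
                    if tokens[i:i + n] == target:
                        vectors.append(
                            FeatureVector(
                                abbreviation=abbreviation,
                                expansion=expansion,
                                left_context=tokens[max(0, i - window):i],
                                right_context=tokens[i + n:i + n + window],
                            )
                        )
    return vectors


if __name__ == "__main__":
    for vector in generate_feature_vectors(CORPUS, ABBREVIATION_DB):
        print(vector)
```

Because the expansions are unambiguous where they are spelled out in full, each stored vector pairs a context with a known correct expansion. Such pairs could then serve as labeled examples for a statistical classifier that picks the right expansion when only the abbreviation appears.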