Authors: Meliha Yetisgen-Yildiz, Wanda Pratt
DOI:
Keywords:
Abstract: This work explores the effect of text representation techniques on the overall performance of medical text classification. To accomplish this goal, we developed a classification system that supports a very basic word representation (bag-of-words) and a more complex phrase representation (bag-of-phrases). We also combined the two representations (hybrid) for further analysis. Our system extracts phrases from text by incorporating a knowledge base and natural language processing techniques. We conducted experiments to evaluate the effects of the different representations by measuring the change in classification performance with MEDLINE documents from the OHSUMED dataset. We measured performance with information retrieval metrics: precision (p), recall (r), and F1-score (F1). In our experiments, we achieved better performance with the hybrid approach (p=0.87, r=0.46, F1=0.60) compared to the bag-of-words approach (p=0.85, r=0.44, F1=0.58) and the bag-of-phrases approach (r=0.42, F1=0.57).
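The three representations and the F1 metric named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' system: the phrase lexicon `PHRASES` is a hypothetical stand-in for a knowledge base, phrases are limited to adjacent word pairs for brevity, and all function names are assumptions made for illustration. The F1 function uses the standard harmonic mean of precision and recall.

```python
from collections import Counter

# Hypothetical phrase lexicon standing in for a knowledge base (assumption,
# not taken from the paper). Each phrase is a pair of adjacent tokens.
PHRASES = {("heart", "attack"), ("blood", "pressure")}


def bag_of_words(tokens):
    """Count each individual token."""
    return Counter(tokens)


def bag_of_phrases(tokens, phrases=PHRASES):
    """Count adjacent token pairs that appear in the phrase lexicon."""
    bigrams = zip(tokens, tokens[1:])
    return Counter(" ".join(pair) for pair in bigrams if pair in phrases)


def hybrid(tokens, phrases=PHRASES):
    """Combine word features and phrase features into one feature bag."""
    return bag_of_words(tokens) + bag_of_phrases(tokens, phrases)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


tokens = "high blood pressure raises heart attack risk".split()
features = hybrid(tokens)

# Recompute F1 from the precision and recall values reported in the abstract.
f1_hybrid = round(f1_score(0.87, 0.46), 2)  # 0.6
f1_words = round(f1_score(0.85, 0.44), 2)   # 0.58
```

Recomputing F1 from the reported precision and recall reproduces the stated hybrid and bag-of-words scores (0.60 and 0.58), which is a quick consistency check on the figures above.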