Authors: Haijun Zhang, Tommy W.S. Chow
DOI: 10.1016/J.ESWA.2011.08.128
Keywords:
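Abstract:
Highlights:
- We propose a multi-level-structured representation to express more semantic information of a document.
- A multi-level matching method incorporating the Earth Mover's Distance (EMD), solved by linear programming, is introduced.
- A hybrid similarity including the global and local similarities is used to enhance retrieval accuracy.
- Experimental results corroborate that our proposed method works well for lengthy documents.
- Our two-step system can serve as a general, computationally efficient solution for DR.

This paper presents a multi-level matching method for document retrieval (DR) using a hybrid document similarity. Documents are represented by a multi-level structure including the document level and the paragraph level. This multi-level-structured representation is designed to model underlying semantics in a more flexible and accurate way than conventional flat term histograms, which find such semantics hard to cope with. The matching between documents is then transformed into an optimization problem with the Earth Mover's Distance (EMD). A hybrid similarity is used to synthesize the global and local semantics in documents to improve retrieval accuracy. In this paper, we have performed an extensive experimental study and verification. The results suggest that the proposed method works well for lengthy documents with evident spatial distributions of terms.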
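
The abstract describes paragraph-level matching as an EMD problem solved by linear programming, combined with a document-level similarity. The following is only a minimal sketch of that idea, not the authors' implementation. Several choices are assumptions made for illustration: cosine distance as the ground distance, uniform paragraph weights, a convex combination controlled by a hypothetical parameter `alpha`, and the function names `cosine_distance`, `emd`, and `hybrid_similarity`. The LP itself is the standard transportation formulation of EMD, solved here with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - float(np.dot(u, v) / (nu * nv))


def emd(weights_a, weights_b, cost):
    """Earth Mover's Distance between two weighted signatures, as an LP.

    cost[i, j] is the ground distance between item i of A and item j of B.
    """
    w_a = np.asarray(weights_a, dtype=float)
    w_b = np.asarray(weights_b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    m, n = cost.shape

    # Flow variables f[i, j] >= 0 are flattened row-major into m * n unknowns.
    # Supply constraints: sum_j f[i, j] <= w_a[i]
    a_rows = np.zeros((m, m * n))
    for i in range(m):
        a_rows[i, i * n:(i + 1) * n] = 1.0
    # Demand constraints: sum_i f[i, j] <= w_b[j]
    a_cols = np.zeros((n, m * n))
    for j in range(n):
        a_cols[j, j::n] = 1.0

    # Total flow must equal the smaller of the two total weights.
    total = min(w_a.sum(), w_b.sum())
    result = linprog(
        cost.ravel(),
        A_ub=np.vstack([a_rows, a_cols]),
        b_ub=np.concatenate([w_a, w_b]),
        A_eq=np.ones((1, m * n)),
        b_eq=[total],
    )
    if not result.success:
        raise RuntimeError(result.message)
    # Normalize the transport cost by the amount of flow moved.
    return result.fun / total


def hybrid_similarity(doc_a, doc_b, paras_a, paras_b, alpha=0.5):
    """Blend a document-level and a paragraph-level similarity.

    alpha is a hypothetical mixing weight (assumption, not from the paper).
    """
    global_sim = 1.0 - cosine_distance(doc_a, doc_b)
    cost = np.array([[cosine_distance(p, q) for q in paras_b] for p in paras_a])
    # Assumption: every paragraph carries equal weight within its document.
    w_a = np.full(len(paras_a), 1.0 / len(paras_a))
    w_b = np.full(len(paras_b), 1.0 / len(paras_b))
    local_sim = 1.0 - emd(w_a, w_b, cost)
    return alpha * global_sim + (1.0 - alpha) * local_sim
```

In this sketch, two documents whose paragraphs are identical but appear in a different order receive a paragraph-level distance of zero, since the flow can pair each paragraph with its counterpart at no cost. With non-negative term vectors the cosine distance stays in [0, 1], so both similarity terms do as well.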