Author: Xiaofan Lin
DOI:
Keywords: Weight value, Line number, Artificial intelligence, Information retrieval, Line (text file), Natural language processing, Computer science
Abstract: Method and apparatus for removing lines of extraneous text from a document. Similarities are identified between lines of text on each page and corresponding lines on a selected subset of pages. Different weight values are associated with different line numbers on a page, each weight value indicating a degree of likelihood that the line contains extraneous text. One or more lines are selectively removed as a function of the similarities.
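
The approach summarized in the abstract can be illustrated with a minimal sketch. This is not the author's implementation. The function names, the neighbouring-page window used as the "selected subset", the edge-based weighting, the digit normalization, and the threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def position_weight(index, n_lines, edge=3):
    """Assumed weighting: lines near the top or bottom of a page are more
    likely to be extraneous (headers, footers), so they get weights near 1."""
    dist = min(index, n_lines - 1 - index)
    return max(0.0, 1.0 - dist / edge)


def normalize(line):
    """Lowercase and mask digit runs so 'Page 1' and 'Page 2' compare equal."""
    return re.sub(r"\d+", "#", line.strip().lower())


def similarity(a, b):
    """Similarity in [0, 1] between two normalized lines."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def corresponding(other_page, index, n_lines):
    """Line on other_page at the same offset from the nearer page edge."""
    if index < n_lines - 1 - index:      # closer to the top: count from top
        j = index
    else:                                # closer to the bottom: count from bottom
        j = len(other_page) - (n_lines - index)
    return other_page[j] if 0 <= j < len(other_page) else None


def remove_extraneous_lines(pages, window=2, threshold=0.6):
    """pages is a list of pages, each a list of text lines.
    A line is dropped when weight * mean similarity reaches the threshold."""
    cleaned = []
    for p, page in enumerate(pages):
        # Assumed "selected subset": pages within `window` of the current one.
        lo, hi = max(0, p - window), min(len(pages), p + window + 1)
        neighbours = [pages[q] for q in range(lo, hi) if q != p]
        kept = []
        for i, line in enumerate(page):
            sims = []
            for other in neighbours:
                match = corresponding(other, i, len(page))
                if match is not None:
                    sims.append(similarity(line, match))
            mean_sim = sum(sims) / len(sims) if sims else 0.0
            score = position_weight(i, len(page)) * mean_sim
            if score < threshold:
                kept.append(line)
        cleaned.append(kept)
    return cleaned
```

In this sketch, a repeated running title or a "Page N" footer scores high on both position and similarity and is removed, while body lines that differ from page to page are kept.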