Authors: Shrikant Kashyap, Panagiotis Karras
Keywords: Pruning (decision trees), Search engine indexing, Artificial intelligence, Curse of dimensionality, Data mining, Computer science, k-nearest neighbors algorithm, Feature vector, Index (publishing), Scalability, Nearest neighbor search, Machine learning
Abstract: Nearest-neighbor search over time series has received vast research attention as a basic data mining task. Still, none of the hitherto proposed methods scales well with increasing time-series length. This is due to the fact that they all provide a one-off pruning capacity only. In particular, traditional methods utilize an index in a reduced-dimensionality feature space; however, for high time-series length, such an index yields many false hits that need to be eliminated by accessing the full records. An attempt to reduce false hits by indexing more features exacerbates the curse of dimensionality, and vice versa. A recently proposed alternative, iSAX, uses symbolic approximate representations accessed via a simple file-system directory index. Yet iSAX also encounters false hits, which are again eliminated by accessing records in full: once a false hit is generated by the index, there is no second chance to prune it; thus, the pruning it provides is one-off. This paper proposes an alternative approach to kNN search, following a nontraditional style. Instead of navigating through candidate records via an index, we access their features, obtained by a multi-resolution transform, in a stepwise sequential-scan manner, one level of resolution at a time, over a vertical representation. Most candidates are progressively eliminated after only a few terms are accessed, using pre-computed information and an unprecedentedly tight double-bounding scheme, involving not only lower, but also upper distance bounds. Our experimental study on large, high-length time-series data confirms the advantage of our approach over both the current state-of-the-art method and classical index-based methods.
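The abstract's core idea can be illustrated with a small sketch. The following is not the paper's implementation; it is a hypothetical reconstruction of the general technique: an orthonormal multi-resolution transform (here a Haar transform, one plausible choice), whose coefficients are scanned coarse-to-fine, with both a lower and an upper bound on the full Euclidean distance maintained from a seen prefix plus pre-computed suffix energies. Candidates whose lower bound exceeds the k-th smallest upper bound are pruned at every level, rather than only once.

```python
import numpy as np

def haar(x):
    """Orthonormal Haar transform, coefficients ordered coarse-to-fine.
    Being orthonormal, it preserves Euclidean distances (Parseval)."""
    x = np.asarray(x, dtype=float).copy()
    n = x.size
    out = np.empty(n)
    pos = n
    while n > 1:
        half = n // 2
        avg  = (x[0:n:2] + x[1:n:2]) / np.sqrt(2.0)
        diff = (x[0:n:2] - x[1:n:2]) / np.sqrt(2.0)
        out[pos - half:pos] = diff   # finer detail coefficients go later
        x[:half] = avg
        n = half
        pos -= half
    out[0] = x[0]
    return out

def knn_double_bound(series, query, k=1):
    """Stepwise kNN scan over transform coefficients, one resolution level
    at a time, pruning with both lower and upper distance bounds.
    (Illustrative sketch, not the paper's algorithm.)"""
    T = np.array([haar(s) for s in series])      # (n, L) transformed data
    q = haar(query)
    n, L = T.shape
    # Pre-computed suffix energies: tail_sq[i, j] = sum of T[i, j:] ** 2
    tail_sq   = np.concatenate([np.cumsum((T ** 2)[:, ::-1], axis=1)[:, ::-1],
                                np.zeros((n, 1))], axis=1)
    q_tail_sq = np.concatenate([np.cumsum((q ** 2)[::-1])[::-1], [0.0]])
    alive = np.arange(n)         # surviving candidate ids
    part  = np.zeros(n)          # exact squared distance on the seen prefix
    pos, width = 0, 1
    while pos < L and alive.size > k:
        w = min(width, L - pos)
        part[alive] += ((T[alive, pos:pos + w] - q[pos:pos + w]) ** 2).sum(axis=1)
        pos += w
        width *= 2               # next resolution level doubles in size
        rx = np.sqrt(tail_sq[alive, pos])   # remaining energy per candidate
        rq = np.sqrt(q_tail_sq[pos])        # remaining energy of the query
        # |rx - rq| <= remaining distance <= rx + rq (triangle inequality)
        lb = part[alive] + (rx - rq) ** 2   # squared lower bound
        ub = part[alive] + (rx + rq) ** 2   # squared upper bound
        thresh = np.sort(ub)[k - 1]         # k-th best guaranteed distance
        alive = alive[lb <= thresh]         # prune again at every level
    # Resolve survivors exactly (part becomes the full squared distance)
    if pos < L:
        part[alive] += ((T[alive, pos:] - q[pos:]) ** 2).sum(axis=1)
    return alive[np.argsort(part[alive])][:k]
```

The pruning is safe because the transform is distance-preserving, so the prefix distance plus the triangle-inequality bounds on the unseen suffix bracket the true distance; the true k nearest neighbors always have a lower bound below the k-th upper bound and therefore survive every round.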