A density-based indexing method for efficient execution of high-dimensional nearest-neighbor queries on large databases

Authors: Dan Geiger, Usama Fayyad, Kristin P. Bennett

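Abstract: A method and apparatus for efficiently performing nearest-neighbor queries on a database of records, wherein each record has a large number of attributes, by automatically extracting a multidimensional index from the data. The method is based on first obtaining a statistical model of the content of the data in the form of a probability density function. This density is then used to decide how the data should be reorganized on disk for efficient nearest-neighbor queries. At query time, the model decides the order in which the data should be scanned. It also provides a means of evaluating the probability of correctness of the answer found so far in the partial scan of the data determined by the model.

In this invention a clustering process is performed on the database to produce multiple data clusters. Each cluster is characterized by a cluster model, and the set of clusters represents a probability density function in the form of a mixture model. A new database of records is built having an augmented record format that contains the original record attributes and an additional attribute containing a cluster number for each record, based on the clustering step. The cluster model uses a probability density function for each cluster, so the augmenting of each record is accomplished by evaluating the record's probability with respect to each cluster. Once the augmented records are built into a database, the augmented attribute is used as an index, so that nearest-neighbor query analysis can be conducted very efficiently using an indexed look-up process. As the database is queried, the probability density function is used to determine the order in which clusters or database pages are scanned, and to determine when scanning can stop because the nearest neighbor has been found with high probability.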
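The cluster-then-index workflow described in the abstract can be illustrated with a minimal sketch. This is not the patented method: plain k-means stands in for the mixture-model density estimate, and a deterministic triangle-inequality bound replaces the probabilistic stopping rule, so the result here is exact rather than correct "with high probability". All function names (`kmeans`, `build_index`, `nearest_neighbor`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_center(p, centers):
    """Index of the center nearest to point p."""
    return min(range(len(centers)), key=lambda j: dist(p, centers[j]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the mixture model)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[closest_center(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        centers = [
            tuple(sum(col) / len(groups[j]) for col in zip(*groups[j]))
            if groups[j] else centers[j]
            for j in range(k)
        ]
    return centers


def build_index(points, centers):
    """Augment each record with a cluster number and bucket records by it.

    Also records each cluster's radius (largest member-to-center distance),
    which the query step uses to skip clusters safely.
    """
    index = defaultdict(list)
    radius = [0.0] * len(centers)
    for p in points:
        j = closest_center(p, centers)
        index[j].append(p)
        radius[j] = max(radius[j], dist(p, centers[j]))
    return index, radius


def nearest_neighbor(query, centers, index, radius):
    """Scan clusters in order of center distance, skipping hopeless ones.

    Returns (best_point, best_distance, clusters_scanned).
    """
    order = sorted(range(len(centers)), key=lambda j: dist(query, centers[j]))
    best, best_d, scanned = None, math.inf, 0
    for j in order:
        # Triangle inequality: no member of cluster j can be closer to the
        # query than dist(query, center_j) - radius_j.
        if dist(query, centers[j]) - radius[j] >= best_d:
            continue
        scanned += 1
        for p in index[j]:
            d = dist(query, p)
            if d < best_d:
                best, best_d = p, d
    return best, best_d, scanned
```

To use the sketch, call `kmeans` on a list of attribute tuples, pass the resulting centers to `build_index`, and then call `nearest_neighbor` for each query. The returned count of scanned clusters shows how much of the data was skipped. The abstract's method would instead rank clusters by the query's probability under each cluster's density and stop once the estimated probability of having found the true neighbor is high enough.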
