Authors: Yoshiharu Ishikawa, Christos Faloutsos, Ravishankar Subramanya
DOI:
Keywords: Computer science, Image (mathematics), Database, Base (topology), Information retrieval, Very large database, Sample (material)
Abstract: Users often cannot easily express their queries. For example, in a multimedia/image-by-content setting, the user might want photographs with sunsets; in current systems, like QBIC, the user has to give a sample query and specify the relative importance of color, shape, and texture. Even worse, the user might want correlations between attributes: for example, in a traditional medical record database, a researcher might want to find “mildly overweight patients”, where the implied query would be “weight/height ≈ 4 lb/inch”. Our goal is to provide a user-friendly, but theoretically solid, method to handle such queries. We allow the user to give several examples and, optionally, their ‘goodness’ scores, and we propose a novel method to “guess” which attributes are important, and with what weight. Our contributions are twofold: (a) we formalize the problem as a minimization problem and show how to solve for the optimal solution, completely avoiding the ad-hoc heuristics of the past. (b) Moreover, we are the first to handle ‘diagonal’ queries (like the ‘overweight’ query above). Experiments on synthetic and real datasets show that our method estimates quickly and accurately the ‘hidden’ distance function in the user’s mind.
Part of this work was done while the author was visiting the University of Maryland and Carnegie Mellon University. This work was supported by NSF under grant IRI-9625428, and also by the National Science Foundation, ARPA, and NASA under Cooperative Agreement No. IRI-9411299. Published in Proceedings of the 24th VLDB Conference, New York, USA, 1998.
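The abstract does not spell out the form of the ‘hidden’ distance function, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the distance is an ellipsoid distance (x − q)ᵀ M (x − q) with det(M) = 1, that q is the score-weighted mean of the examples, and that M is the rescaled inverse of their score-weighted covariance. The function names `estimate_distance` and `distance`, and the (height, weight) data, are hypothetical.

```python
def estimate_distance(examples, scores=None):
    """Fit a 2-D ellipsoid distance (x - q)^T M (x - q) to user examples.

    Assumptions (not stated in the abstract): q is the score-weighted mean,
    and M is the inverse of the score-weighted covariance matrix, rescaled
    so that det(M) = 1.
    """
    if scores is None:
        scores = [1.0] * len(examples)  # treat all examples as equally good
    total = float(sum(scores))

    # Query point: score-weighted mean of the examples.
    qx = sum(s * x for s, (x, _) in zip(scores, examples)) / total
    qy = sum(s * y for s, (_, y) in zip(scores, examples)) / total

    # Score-weighted scatter matrix. Its overall scale does not matter,
    # because M is normalised to unit determinant below.
    cxx = sum(s * (x - qx) ** 2 for s, (x, _) in zip(scores, examples))
    cyy = sum(s * (y - qy) ** 2 for s, (_, y) in zip(scores, examples))
    cxy = sum(s * (x - qx) * (y - qy) for s, (x, y) in zip(scores, examples))

    det = cxx * cyy - cxy ** 2
    if det <= 0:
        raise ValueError("examples are degenerate; more spread is needed")

    # inverse(C) = adj(C) / det, and scaling by sqrt(det) gives det(M) = 1.
    k = det ** 0.5
    m = ((cyy / k, -cxy / k), (-cxy / k, cxx / k))
    return (qx, qy), m


def distance(q, m, point):
    """Evaluate (point - q)^T M (point - q) for a symmetric 2x2 matrix M."""
    dx, dy = point[0] - q[0], point[1] - q[1]
    return m[0][0] * dx * dx + 2 * m[0][1] * dx * dy + m[1][1] * dy * dy


# Hypothetical examples scattered around weight = 4 * height (inches, lb).
q, m = estimate_distance([(60, 238), (64, 258), (68, 270), (72, 290)])
on_diagonal = distance(q, m, (67, 268))   # moves along the 4 lb/inch line
off_diagonal = distance(q, m, (70, 263))  # same Euclidean step, across the line
print(on_diagonal, off_diagonal)
```

Both test points are the same Euclidean distance from q, yet the fitted function ranks the one lying along the 4 lb/inch direction as far closer. The off-diagonal entries of M are what make this kind of ‘diagonal’ query expressible; per-attribute weights alone (a diagonal M) could not capture it.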