Author: Seth Patinkin
DOI:
Keywords:
Abstract: The invention provides a method, apparatus, and system for the classification and clustering of electronic data streams such as email, images, and sound files, for identification, sorting, and efficient storage. The inventive systems disclose labeling a document as belonging to a predefined class through computer methods that comprise the steps of identifying an electronic data stream using one or more learning machines and comparing the outputs from the machines to determine the label to associate with the data. The method further utilizes learning machines in combination with hashing schemes to cluster and classify documents. In one embodiment, hash apparatuses and methods taxonomize clusters. In yet another embodiment, clusters of documents utilize geometric hashing to contain the documents in a corpus without the overhead of search.
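The two ideas in the abstract, comparing the outputs of several learning machines to choose a label and using a hashing scheme to group documents, can be illustrated with a minimal sketch. This is not the patented method: the keyword-based "machines", the majority vote, and the single min-hash over word shingles are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import hashlib
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a learning machine: any callable that maps
# a document to a class label.
Classifier = Callable[[str], str]


def keyword_machine(keyword: str) -> Classifier:
    """Toy 'machine' (assumption): labels a document 'spam' if it contains keyword."""
    def classify(doc: str) -> str:
        return "spam" if keyword in doc.lower() else "ham"
    return classify


def label_by_vote(doc: str, machines: List[Classifier]) -> str:
    """Compare the machines' outputs and return the most common label."""
    votes = Counter(machine(doc) for machine in machines)
    return votes.most_common(1)[0][0]


def shingle_key(doc: str, k: int = 3) -> int:
    """Single min-hash over word k-shingles (assumed scheme, not the patent's).

    Documents with the same normalized shingles always share a key.
    """
    words = doc.lower().split()
    shingles = [" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))]
    return min(int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16) for s in shingles)


def cluster_by_hash(docs: Iterable[str]) -> Dict[int, List[str]]:
    """Bucket documents by hash key, so grouping needs no pairwise comparison."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for doc in docs:
        buckets[shingle_key(doc)].append(doc)
    return dict(buckets)


machines = [keyword_machine("free"), keyword_machine("winner"), keyword_machine("prize")]
label = label_by_vote("You are a WINNER of a free cruise", machines)
clusters = cluster_by_hash([
    "Buy cheap watches now",
    "buy cheap   watches NOW",
    "Quarterly report attached for review",
])
```

An odd number of machines is used so the vote cannot tie. In this sketch, hash buckets act as clusters directly, which reflects the abstract's point about avoiding search overhead only in a very loose sense.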