Authors: Tara Safavi, Chandra Sripada, Danai Koutra
DOI: 10.1007/S10115-018-1293-8
Keywords:
Abstract: Discovering and analyzing networks from non-network data is a task with applications in fields as diverse as neuroscience, genomics, climate science, economics, and more. In domains where networks are discovered on multiple time series, the most common approach is to compute measures of association or similarity between all pairs of time series. The nodes in the resultant network correspond to time series, which are linked by edges weighted according to the association scores of their endpoints. Finally, the fully connected network is thresholded such that only the edges with stronger weights remain and the desired sparsity level is achieved. While this approach is feasible for small datasets, its quadratic (or higher) time complexity does not scale as the individual time series length and the number of compared series increase. Thus, to circumvent the inefficient and wasteful intermediary step of building a fully connected graph before network sparsification, we propose a fast network discovery approach based on probabilistic hashing. Our methods emphasize consecutiveness, or the intuition that time series following similar fluctuations in longer time-consecutive intervals are more similar overall. Evaluation on real data shows that our method can build graphs nearly 15 times faster than baselines (when the baselines do not run out of memory), while achieving accuracy comparable to, or better than, baselines in task-based evaluation. Furthermore, our proposals are general, modular, and may be applied to a variety of sequence similarity search tasks.
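The contrast drawn in the abstract can be illustrated with a minimal sketch. The first function follows the baseline as described: score all pairs, then threshold. The second is only a generic stand-in for hashing-based candidate generation: it assumes each series is binarized around its own mean and bucketed on randomly placed contiguous windows, so that only series agreeing on a run of consecutive time steps are ever compared. The names `pairwise_network`, `binarize`, and `hashed_candidates`, the use of Pearson correlation as the association measure, and the windowing scheme are all assumptions made for illustration; they are not the hash family or similarity measure defined in the paper.

```python
import random
from collections import defaultdict
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def pairwise_network(series, threshold):
    """Baseline: score every pair (quadratic in the number of series),
    then keep only edges whose absolute weight reaches the threshold."""
    edges = {}
    for i, j in combinations(range(len(series)), 2):
        r = pearson(series[i], series[j])
        if abs(r) >= threshold:
            edges[(i, j)] = r
    return edges


def binarize(x):
    """Assumed preprocessing: 1 where the series is above its own mean, else 0."""
    m = mean(x)
    return tuple(1 if v > m else 0 for v in x)


def hashed_candidates(series, window, num_tables, seed=0):
    """Hypothetical candidate generation: two series become a candidate pair
    if their binarized forms agree on at least one sampled contiguous window."""
    rng = random.Random(seed)
    bits = [binarize(x) for x in series]
    length = len(bits[0])
    candidates = set()
    for _ in range(num_tables):
        start = rng.randrange(length - window + 1)
        buckets = defaultdict(list)
        for idx, b in enumerate(bits):
            # The hash key is the run of consecutive bits inside the window.
            buckets[b[start:start + window]].append(idx)
        for members in buckets.values():
            candidates.update(combinations(members, 2))
    return candidates


if __name__ == "__main__":
    data = [
        [1, 2, 3, 4, 5, 6, 7, 8],
        [2, 4, 6, 8, 10, 12, 14, 16],
        [8, 1, 7, 2, 6, 3, 5, 4],
    ]
    print(pairwise_network(data, threshold=0.9))
    print(hashed_candidates(data, window=4, num_tables=3))
```

In this sketch only the pairs returned by `hashed_candidates` would need to be scored, so the fully connected graph is never materialized. How much work that saves depends on the window length and the number of hash tables, which are free parameters here.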