DOI: 10.3233/IDA-140668
Keywords: Task (project management), Public domain, Test data generation, Data mining, Semantics, Benchmarking, Temporal database, Noise (video), Computer science, Data stream mining
Abstract: Frequent episode mining has been proposed as a data mining task for recovering sequential patterns from temporal data sequences, and several approaches have been introduced over the last fifteen years. These techniques have, however, never been compared against each other in a large-scale comparison, mainly because existing real-life data is prevented from entering the public domain by non-disclosure agreements. We perform such a comparison for the first time. To get around the problem of proprietary data, we employ a data generator based on a number of real-life observations and capable of generating data that mimics the real-life data at our disposal. Artificial data offers the additional advantage that the underlying patterns are known, which is typically not the case for real-life data. Thus, we can evaluate for the first time the ability of mining approaches to recover patterns embedded in noise. Our experiments indicate that constraints are more important in affecting the effectiveness of episode mining than occurrence semantics. They also indicate that recovering patterns when several phenomena are present at the same time is rather difficult, and that there is a need to develop better significance measures for dealing with sets of episodes.