Authors: Craig Silverstein, Sergey Brin, Rajeev Motwani
Keywords: Market basket, Association rule learning, Purchasing, Efficient algorithm, Data mining, Apriori algorithm, Synthetic data, Mathematics
Abstract: One of the more well-studied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: "A customer purchasing item A often also purchases item B." Motivated partly by the goal of generalizing beyond market basket data and partly by the goal of ironing out some problems in the definition of association rules, we develop the notion of dependence rules that identify statistical dependence in both the presence and absence of items in itemsets. We propose measuring significance of dependence via the chi-squared test for independence from classical statistics. This leads to a measure that is upward-closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between dependent and independent itemsets in the lattice. We develop pruning strategies based on the closure property and thereby devise an efficient algorithm for discovering dependence rules. We demonstrate our algorithm's effectiveness by testing it on census data, text data (wherein we seek term dependence), and synthetic data.
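
As a rough illustration of the chi-squared test for independence mentioned in the abstract, the sketch below computes the statistic for a small itemset over a list of baskets, counting both presence and absence of each item. It is a minimal sketch, not the authors' algorithm: the function name `chi_squared`, the representation of baskets as Python sets, and the toy data are all assumptions made for illustration, and no lattice search or pruning is attempted.

```python
from itertools import product


def chi_squared(baskets, items):
    """Chi-squared statistic for independence of `items` across `baskets`.

    Hypothetical helper: `baskets` is a list of sets, `items` an iterable
    of item labels. Every presence/absence combination is one cell of the
    contingency table.
    """
    n = len(baskets)
    items = list(items)

    # Marginal probability that each item appears in a basket.
    p = [sum(1 for b in baskets if it in b) / n for it in items]

    # Observed count for each presence/absence cell.
    observed = {}
    for b in baskets:
        cell = tuple(it in b for it in items)
        observed[cell] = observed.get(cell, 0) + 1

    stat = 0.0
    for cell in product([True, False], repeat=len(items)):
        # Expected count if the items were independent.
        expected = n
        for present, p_i in zip(cell, p):
            expected *= p_i if present else (1 - p_i)
        if expected > 0:
            stat += (observed.get(cell, 0) - expected) ** 2 / expected
    return stat


# Toy data (made up): A and B always occur together or not at all.
dependent = [{"A", "B"}, {"A", "B"}, set(), set()]
# Toy data (made up): every presence/absence combination occurs once.
independent = [{"A", "B"}, {"A"}, {"B"}, set()]

print(chi_squared(dependent, ["A", "B"]))
print(chi_squared(independent, ["A", "B"]))
```

For a pair of items the table has one degree of freedom, so a statistic above roughly 3.84 would reject independence at the 95% level. With only four baskets the expected cell counts are far too small for the test to be trusted in practice; the toy data only shows how the statistic separates the two cases.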