Why the Information Explosion Can Be Bad for Data Mining, and How Data Fusion Provides a Way Out.

Authors: Peter van der Putten, Joost N. Kok, Amar Gupta

Keywords: Database marketing, Mass media, Data science, Media consumption, Product (business), Market research, Computer science, Data stream mining, Information explosion, Knowledge extraction, Data mining

Abstract: One may claim that the exponential growth in the amount of data provides great opportunities for data mining. Reality can be different, though. In many real-world applications, the number of sources over which this information is fragmented grows at an even faster rate, resulting in barriers to the widespread application of data mining and in missed business opportunities. Let us illustrate this paradox with a motivating example from database marketing. In marketing, direct forms of communication are becoming increasingly popular. Instead of broadcasting a single message to all customers through traditional mass media such as television and print, the most promising potential customers receive personalized offers through the most appropriate channels. So it becomes more important to gather information about media consumption, attitudes, product propensity, etc. at an individual level. Basic, company-specific customer information resides in customer databases, but market survey data depicting a richer view of the customer is only available for a small sample or a disjoint set of reference customers. Collecting all this information for the whole customer database in a single-source survey would certainly be valuable, but is usually a very expensive proposition. The common alternative within

References (12)
Jonathan Jephcott, Timothy Bock, The Application and Validation of Data Fusion. International Journal of Market Research, vol. 40, pp. 1-18 (1998), 10.1177/147078539804000301
John O'Brien, Paul Harris, Ken Baker, Data Fusion: An Appraisal and Experimental Evaluation. International Journal of Market Research, vol. 39, pp. 1-52 (1997), 10.1177/147078539703900101
Tao Xiong, V. Cherkassky, A combined SVM and LDA approach for classification. International Joint Conference on Neural Networks, vol. 3, pp. 1455-1459 (2005), 10.1109/IJCNN.2005.1556089
Nancy Ruggles, Richard Ruggles, A Strategy for Merging and Matching Microdata Sets. Research Papers in Economics, pp. 353-371 (1974)
Donald B. Rubin, Statistical Matching Using File Concatenation With Adjusted Weights and Multiple Imputations. Journal of Business & Economic Statistics, vol. 4, pp. 87-94 (1986), 10.1080/07350015.1986.10509497
D. Nguyen, B. Widrow, Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. 1990 IJCNN International Joint Conference on Neural Networks, vol. 1990, pp. 21-26 (1990), 10.1109/IJCNN.1990.137819
Roderick J. A. Little, Donald B. Rubin, Statistical Analysis with Missing Data (1987)
Edward C. Budd, The Creation of a Microdata File for Estimating the Size Distribution of Income. Review of Income and Wealth, vol. 17, pp. 317-333 (1971), 10.1111/J.1475-4991.1971.TB00785.X
Willard L. Rodgers, An Evaluation of Statistical Matching. Journal of Business & Economic Statistics, vol. 2, pp. 91-102 (1984), 10.1080/07350015.1984.10509373