Authors: Norazian Mohamed Noor, A.S. Yahaya, N.A. Ramli, Mohd Mustafa Al Bakri Abdullah
DOI: 10.4028/WWW.SCIENTIFIC.NET/AMM.754-755.923
Keywords: Mean absolute error, Mean squared error, Nearest neighbour, Mathematics, Imputation (statistics), Missing data, Statistics, Linear interpolation
Abstract: Hourly measured PM10 concentrations at eight monitoring stations within peninsular Malaysia in 2006 were used to generate simulated missing data. The gap lengths of the simulated missing values are limited to 12 hours, since the actual trend of missingness is considered short. Two percentages of simulated missing gaps were generated: 5% and 15%. A number of single imputation methods (linear interpolation (LI), nearest neighbour (NN), mean above below (MAB), daily mean (DM), 12-hour mean (12M), 6-hour mean (6M), row mean (RM) and previous year (PY)) were calculated to fill in the simulated missing data. In addition, multiple imputation (MI) was also conducted for comparison with the single imputation methods. The performances were evaluated using four statistical criteria, namely mean absolute error, root mean squared error, prediction accuracy and index of agreement. The results show that 6M performs comparably well to LI. Thus, a smaller averaging time gives better prediction. The other single imputation methods predict the missing data well, except for PY. RM and MI perform moderately, with performance increasing at the higher fraction of missing data, whereas LR is the worst method for both percentages of simulated missing data.
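To make the comparison described in the abstract concrete, the following is a minimal sketch of three of the single imputation methods (LI, NN and MAB) and two of the four criteria (mean absolute error and root mean squared error). It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy PM10 values, the tie-breaking rule in NN (the earlier observation wins) and the requirement that every gap be bounded by observed values on both sides are all assumptions introduced here. Prediction accuracy and index of agreement are left out because the abstract does not give their formulas.

```python
import math

def _gaps(values):
    """Yield (start, end) index pairs of runs of None; end is exclusive.
    Assumption: every gap has an observed value on both sides."""
    i, n = 0, len(values)
    while i < n:
        if values[i] is None:
            j = i
            while j < n and values[j] is None:
                j += 1
            if i == 0 or j == n:
                raise ValueError("gap touches the edge of the series")
            yield i, j
            i = j
        else:
            i += 1

def linear_interpolation(values):
    """LI: draw a straight line between the observations bounding each gap."""
    out = list(values)
    for start, end in _gaps(values):
        left, right = values[start - 1], values[end]
        span = end - (start - 1)
        for k in range(start, end):
            out[k] = left + (right - left) * (k - (start - 1)) / span
    return out

def mean_above_below(values):
    """MAB: fill the whole gap with the mean of the two bounding observations."""
    out = list(values)
    for start, end in _gaps(values):
        fill = (values[start - 1] + values[end]) / 2
        for k in range(start, end):
            out[k] = fill
    return out

def nearest_neighbour(values):
    """NN: copy the closer bounding observation (assumed: ties go to the earlier one)."""
    out = list(values)
    for start, end in _gaps(values):
        for k in range(start, end):
            d_left, d_right = k - (start - 1), end - k
            out[k] = values[start - 1] if d_left <= d_right else values[end]
    return out

def mae(pred, obs):
    """Mean absolute error over paired predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root mean squared error over paired predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical hourly PM10 values; a 3-hour gap is simulated at indices 1..3.
observed = [40.0, 44.0, 50.0, 58.0, 60.0, 55.0]
with_gap = [40.0, None, None, None, 60.0, 55.0]
missing_idx = [1, 2, 3]

for name, method in [("LI", linear_interpolation),
                     ("NN", nearest_neighbour),
                     ("MAB", mean_above_below)]:
    filled = method(with_gap)
    pred = [filled[i] for i in missing_idx]
    true = [observed[i] for i in missing_idx]
    print(name, round(mae(pred, true), 3), round(rmse(pred, true), 3))
```

The errors are computed only over the positions that were artificially removed, which is the reason for simulating gaps in a complete record: the true values are known, so each method's fill can be scored directly.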