Authors: Shaoxu Song, Aoqian Zhang, Lei Chen, Jianmin Wang
Keywords:
Abstract: Incomplete information often occurs in many database applications, e.g., in data integration, data cleaning, or data exchange. The idea of data imputation is to fill the missing data with the values of its neighbors who share the same information. Such neighbors could either be identified certainly by editing rules or statistically by relational dependency networks. Unfortunately, owing to data sparsity, the number of neighbors (identified w.r.t. value equality) is rather limited, especially in the presence of data values with variances. In this paper, we argue to extensively enrich similarity neighbors by similarity rules with tolerance to small variations. More fillings can thus be acquired that the aforesaid equality neighbors fail to reveal. To fill more of the missing values, we study the problem of maximizing the missing data imputation. Our major contributions include (1) the NP-hardness analysis on solving and approximating the problem, (2) exact algorithms for tackling the problem, and (3) efficient approximation with performance guarantees. Experiments on real and synthetic data sets demonstrate that the filling accuracy can be improved.