Authors: Mei Yan, Xiaojie Yang, Weiqiang Hang, Yingcun Xia
DOI: 10.1007/S00477-019-01677-Z
Keywords:
Abstract: Non-negative matrix factorization has been used in many disciplines of research, where the number of factors plays a crucial role. However, a fully data-driven method for determining the number of factors is not yet available in the literature. Based on the fact that the most appropriate number of factors should generate the best prediction, in this paper we propose a selection method using a two-step delete-one-out approach, called twice cross-validation. This method is easy to implement and data-driven. It also works when constraints are imposed, including sparsity. Intensive simulations and real data analyses suggest that the proposed method performs well in most cases and can select the number of factors correctly when it is much less than the dimension of the variables and the sample size is reasonably large. As an important application, the method is applied to source apportionment of air pollution in Singapore, where it provides physically reasonable source profiles.