Segmented principal component transform–principal component analysis

Authors: António S. Barros, Douglas N. Rutledge

DOI: 10.1016/J.CHEMOLAB.2005.01.003

Keywords: Matrix (mathematics); Pattern recognition; Sparse PCA; Data matrix (multivariate statistics); Computer science; Kernel principal component analysis; Principal component analysis; Extension (predicate logic); Decomposition (computer science); Artificial intelligence

Abstract: A new approach to perform Principal Component Analysis (PCA) on very wide matrices is proposed in this work. The procedure is based on an extension of the Principal Component Transform (PCT) concept, with the PCT being applied to non-superimposed segments of the data matrix. It is shown that the method uses less memory than the classical global PCA, since the decomposition is done on much smaller matrices, which has an important impact on memory requirements. It is also shown that Segmented PCT-PCA (SegPCT-PCA) yields the same results as a global PCA. This will allow the study of data sets (e.g. 2D-NMR) that were difficult to handle using the classical approach. The implementation of SegPCT-PCA is straightforward. An advantage is that it is not necessary to read the complete data matrix into main memory, and the method could be used for parallel calculations and cross-validation purposes.
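
The idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes column mean-centering, an SVD-based PCT on each segment, and equal-width, non-overlapping column segments. The function name `seg_pct_pca` and its parameters are hypothetical.

```python
import numpy as np

def seg_pct_pca(X, n_segments, n_components):
    """Illustrative segmented PCT-PCA (hypothetical helper, not the paper's code).

    X is an (n_samples, n_variables) array; n_segments must not exceed
    the number of columns.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    seg_scores, seg_loadings = [], []

    # Step 1: apply the PCT to each non-overlapping column segment.
    for cols in np.array_split(np.arange(p), n_segments):
        Xk = X[:, cols]
        Xk = Xk - Xk.mean(axis=0)           # column centering is local to a segment
        U, s, Vt = np.linalg.svd(Xk, full_matrices=False)
        seg_scores.append(U * s)            # segment scores T_k
        seg_loadings.append(Vt.T)           # segment loadings P_k

    # Step 2: PCA on the concatenated (much narrower) score matrix.
    T = np.hstack(seg_scores)
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    scores = (U * s)[:, :n_components]
    V = Vt.T[:, :n_components]

    # Step 3: map the loadings back to the original variable space.
    blocks, offset = [], 0
    for Pk in seg_loadings:
        r = Pk.shape[1]
        blocks.append(Pk @ V[offset:offset + r])
        offset += r
    loadings = np.vstack(blocks)
    return scores, loadings
```

Because each segment satisfies X_k X_k^T = T_k T_k^T, the concatenated scores reproduce X X^T of the centered matrix, so the scores match those of a global PCA up to the sign of each component. In this sketch the whole matrix is held in memory for simplicity; each segment could instead be read from disk inside the loop, which is where the memory saving described in the abstract would come from.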
