DOI:
关键词: Positive-definite matrix 、 Sparse matrix 、 Matrix (mathematics) 、 Algorithm 、 Computer science 、 Eigendecomposition of a matrix 、 Mathematical optimization 、 Single-entry matrix 、 Cuthill–McKee algorithm 、 Numerical linear algebra 、 Eigenvalues and eigenvectors
摘要: A range of numerical linear algebra problems that arise in finite element-based structural analysis are considered. These were encountered when implementing the element method software package Oasys GSA. We present novel solutions to these form a new for error detection, algorithms with superior efficiency and scalable performance on parallel computers. The their corresponding implementations have been integrated into GSA's program code we results demonstrate use by engineers solve real-world problems. We start introducing detecting sources ill conditioning stiffness matrices. The matrix can become ill-conditioned due variety errors including user errors, i.e., definition model using package. We identify two classes develop called stability where arise. Our debugging models uses properties eigenvectors associated smallest largest eigenvalues parts badly defined. Through examples, real-life show upon fixing identified method, condition number matrices typically drops from $O(10^{16})$ $O(10^8)$. In second part introduce symmetric definite generalized eigenproblem analysis. The eigenvalue problem is $Kx = \lambda Mx$ positive-definite, sparse $K$ semidefinite mass $M$. %We seek closest 0 this problem. We analyse existing solution algorithm, which subspace iteration method. We improve algorithm accelerating convergence via shift strategies reducing floating point operations selective re-orthogonalization locking converged eigenvectors. The implementation also benefits improvements such as faster more robust libraries during iteration. We then an solving $K x x$ eigenvectors; arises analysis. In next tackle main bottleneck computations our eigensolvers: matrix-multiple vector multiplication kernel. We obtain has high computational throughput operation $Y AX$ square $A$ conforming skinny, dense $X$ which, turn, depends underlying storage format $A$. Current formats bandwidth requirements poor reuse cache register loaded entries, restricts performance. We propose mapped blocked row format: bitmapped stores entries blocks without fill overhead, thereby offering blocking additional overheads. An efficient decodes bitmaps de Bruijn sequences minimizes conditionals evaluated. Performance compared popular formats, vendor BLAS. Our multiple-vector achieves all platforms implemented platform-neutral optimizations. The last thesis deals $B$-orthogonalization problem, positive $B \in \R^{\nbyn}$ tall skinny $X \R^{\nbyl}$, wish rotate columns $X^TBX I$. This engineer wishes orthonormalize they orthogonal respect matrix. Conventionally literature Gram-Schmidt like methods employed \bo ization but efficiency. recall Cholesky factorization inner product $X^TBX$ derive stable increases scalability through cache-reuse numerically stable. Our experiments orders magnitude improvement family methods.