Recognition: unknown
Performant Tridiagonal Factorization of Skew-Symmetric Matrices
read the original abstract
The factorization of skew-symmetric matrices is a critically understudied area of dense linear algebra, particularly in comparison to that of general and symmetric matrices. While some algorithms can be adapted from the symmetric case, the cost of algorithms can be reduced by exploiting skew-symmetry. This work examines the factorization of a skew-symmetric matrix $X$ into its $LTL^T$ decomposition, where $L$ is unit lower triangular and $T$ is tridiagonal. This is also known as a triangular tridiagonalization. This operation is a means for computing the determinant of $X$ as the square of the (cheaply-computed) Pfaffian of the skew-symmetric tridiagonal matrix $T$ as well as for solving systems of equations, across fields such as quantum electronic structure and machine learning. Its application also often requires pivoting in order to improve numerical stability. We compare and contrast previously-published algorithms with those systematically derived using the FLAME methodology. Performant parallel CPU implementations are achieved by fusing operations at multiple levels in order to reduce memory traffic overhead. A key factor is the employment of new capabilities of the BLAS-like Library Instantion Software (BLIS) framework, which now supports casting level-2 and level-3 BLAS-like operations by leveraging its gemm and other kernels, hierarchical parallelism, and cache blocking. A prototype, concise C++ API facilitates the translation of correct-by-construction algorithms into correct code. Experiments verify that the resulting implementations greatly exceed the performance of previous work.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
A Proposed Framework for Advanced (Multi)Linear Infrastructure in Engineering and Science (FAMLIES)
Proposes FAMLIES as a new flexible framework that vertically integrates linear and multi-linear algebra software stacks for high-performance computing on diverse architectures.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.