Solve systems of equations where the coefficient matrix is sparse.
Sparse Solvers in the Accelerate framework handle the solution of systems of equations where the coefficient matrix is sparse, that is, most of the entries in the matrix are zero.
Many problems in science and technology require the solution of large systems of simultaneous equations. When these equations are linear, they are normally represented as the matrix equation Ax = b (and even when the equations are nonlinear, the problem is often solved as a sequence of linear approximations).
For more information, see WWDC 2017 - Session 711 (Accelerate and Sparse Solvers).
Solution Approaches Offered by Accelerate
Accelerate offers two different solution approaches:
Direct methods perform a factorization such as Cholesky (A = LLᵀ) or QR. These methods provide a fast and accurate black-box solution.
Iterative methods find an approximate solution requiring only repeated multiplication by A or Aᵀ. Although they require less memory than direct methods, and can be faster for very large problems, they typically require problem-specific preconditioners to be effective.
Just as Sparse BLAS provides a sparse counterpart to dense BLAS, Sparse Solvers in the Accelerate framework provide a sparse counterpart to the dense factorizations provided by LAPACK.
In the simplest case, a matrix A is stored as a 2D dense array of floating point values, and routines from BLAS, LAPACK, and other similar libraries are used to manipulate matrices and solve equations. However, such algorithms normally require O(n²) data and O(n³) operations and, as a result, scaling to a large n is prohibitive.
The key to avoiding the expense of these algorithms is taking advantage of the fact that in many real-world applications, most of the entries in A are zero. Such matrices are said to be sparse (conversely, nonsparse matrices are called dense).
These zeros arise naturally in these types of situations:
Incomplete data sets (for example, each user buys only a small fraction of the products from a retailer)
Limited connectivity (for example, most people on social networks are connected only to a tiny proportion of the entire user base)
Physical properties (for example, points on structural meshes only connect to locally adjacent points).
By exploiting these zero entries, you can often reduce the storage and computational requirements to O(tau*n) and O(tau*n²) respectively, where tau is the average number of entries in each column. This reduction makes the solution of large problems (n in the millions or larger) tractable on most computers.
For example, the sparse benchmark matrix ldoor, which arises from structural modeling, has 952,200 x 952,200 entries with an average of 25 nonzero
Double entries per column. Table 1 shows the number of floating point operations (1 Tflop is 10¹² floating point operations) and the memory required to perform Cholesky factorization on that matrix.
Floating point operations