Best way of solving linear equations

Hi there,

I have a program where I am doing some weighted least squares, and for that I need to solve a system of linear equations.

I have tried doing this with the usual weighted least squares formulas; however, that requires finding the inverse of a matrix, which is causing me problems. The issue is that if the values of the matrix to be inverted are very large, the entries of the inverse suffer from underflow.

I have come to understand that finding the inverse is a bad approach, and that I should instead do a QR decomposition (my design matrix is not square) and then solve through back substitution.

So is that something that is worth implementing myself? And if so, do you have any resources you can point me to? I have looked a bit at a Python package called sympy.

Or is it not worth my time, and should I rather use a library like LAPACK or Eigen, at the cost of introducing a dependency into my program?
Just use Eigen. IMO it's incorrect to think of it as a dependency, since you can just bundle it with your code. You can then treat it as if it had been part of your codebase all along.
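For what it's worth, here is a minimal sketch of what the weighted least-squares solve might look like with Eigen: scale the rows by sqrt(w_i) and use a Householder QR, so that (X^T*W*X) is never inverted. The sizes and data below are placeholders, not from your program:

#include <Eigen/Dense>
#include <iostream>

int main()
{
   // Hypothetical sizes and data: m observations, n parameters.
   const int m = 5, n = 2;
   Eigen::MatrixXd X = Eigen::MatrixXd::Random( m, n );     // design matrix
   Eigen::VectorXd y = Eigen::VectorXd::Random( m );        // observations
   Eigen::VectorXd w = Eigen::VectorXd::Constant( m, 1.0 ); // weights

   // Scale each row of X and y by sqrt(w_i), then solve the ordinary
   // least-squares problem with a rank-revealing Householder QR.
   Eigen::VectorXd s    = w.cwiseSqrt();
   Eigen::MatrixXd Xw   = s.asDiagonal() * X;
   Eigen::VectorXd yw   = s.asDiagonal() * y;
   Eigen::VectorXd beta = Xw.colPivHouseholderQr().solve( yw );

   std::cout << "beta =\n" << beta << '\n';
}

colPivHouseholderQr() handles a rectangular X directly and returns the least-squares solution, so the normal equations never need to be formed at all.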
QR takes a modest amount of time to compute. See first whether the library has a solver built in; if you then need its lower-level functions to roll your own solution, you can do that when nothing else works for you, but I can't see that being likely.
What do you mean by "my design matrix is not square"? It would definitely be a bad idea to find the inverse if that were true! QR or LU decomposition should be OK and most matrix libraries will give you those. If your matrix has other features there may be better methods - for example, if it is symmetric you can use Cholesky factorisation, if it is tridiagonal then use the Thomas algorithm, if it is very diagonally dominant then use an iterative method like Gauss-Seidel. So it depends what features your equations have.
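Just to illustrate the last option mentioned above: a minimal sketch of plain Gauss-Seidel iteration for a square system A x = b. The example matrix is made up, and the method is only appropriate if your matrix is (for instance) diagonally dominant, which a least-squares normal matrix need not be:

#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Iterate x_i = ( b_i - sum_{j != i} A_ij x_j ) / A_ii until the largest
// update falls below a tolerance. Assumes A_ii != 0 and convergence.
std::vector<double> gaussSeidel( const std::vector<std::vector<double>> &A,
                                 const std::vector<double> &b,
                                 int maxIter = 1000, double tol = 1.0e-10 )
{
   int n = b.size();
   std::vector<double> x( n, 0.0 );
   for ( int iter = 0; iter < maxIter; iter++ )
   {
      double maxChange = 0.0;
      for ( int i = 0; i < n; i++ )
      {
         double sum = b[i];
         for ( int j = 0; j < n; j++ ) if ( j != i ) sum -= A[i][j] * x[j];
         double xNew = sum / A[i][i];
         maxChange = std::max( maxChange, std::abs( xNew - x[i] ) );
         x[i] = xNew;
      }
      if ( maxChange < tol ) break;
   }
   return x;
}

int main()
{
   std::vector<std::vector<double>> A = { { 4, 1, 0 },
                                          { 1, 5, 2 },
                                          { 0, 2, 6 } };
   std::vector<double> b = { 1, 2, 3 };
   for ( double xi : gaussSeidel( A, b ) ) std::cout << xi << '\n';
}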
I mean that my design matrix X is not a square matrix...

However (X^T*W*X) is a square matrix that I am trying to invert.

Good point about the properties of my matrix, though. However, getting the fastest method is not super important, so I might look for something that just works and has decent speed.
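For reference, here is the algebra under discussion written out explicitly (the standard weighted least-squares identities, assuming X has full column rank; this is not quoted from anyone above). The closed form

\[
  \hat\beta = (X^{\mathsf T} W X)^{-1} X^{\mathsf T} W y
\]

gives the same solution as factoring the weighted design matrix,

\[
  W^{1/2} X = Q R, \qquad Q^{\mathsf T} Q = I, \quad R \text{ upper triangular},
\]

and then solving

\[
  R \hat\beta = Q^{\mathsf T} W^{1/2} y
\]

by back substitution, so the inverse of \(X^{\mathsf T} W X\) is never formed.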
Do you want the inverse or the solution?
For a general, moderately dense matrix it is probably best to use LU decomposition. Eigen has a module for that. Alternatively, you can try the one here:
http://www.cplusplus.com/forum/general/235545/#msg1059501
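For reference, a minimal sketch of an LU-based solve with Eigen on a square system (the numbers are placeholders):

#include <Eigen/Dense>
#include <iostream>

int main()
{
   // Hypothetical 3x3 system A x = b.
   Eigen::MatrixXd A(3,3);
   A <<  4, -2,  1,
        -2,  5, -1,
         1, -1,  3;
   Eigen::VectorXd b(3);
   b << 1, 2, 3;

   // LU decomposition with partial pivoting (Eigen's PartialPivLU).
   Eigen::VectorXd x = A.partialPivLu().solve( b );

   std::cout << "x =\n" << x << '\n';
}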
Hi,

jonnin:
I want the solution; taking the inverse was just my initial, somewhat unsuccessful attempt at this.

lastchance:
Thanks for the suggestion, but my design matrix is not really square, so I do not think I can use LU decomposition. Do you happen to have an implementation of QR decomposition?

Thanks!
I have one that is so old it may as well be in C, using 'successive over-relaxation'.

It's expensive to compute, but the algorithms are all fairly straightforward. I would just google it and create it from the tools in your library if possible.

Your library has to have a solver in it. Have you tried it? You may not need QR; the library may already have one, and may already be using it inside its solver... you are looking at reinventing the wheel here.

@lo2
Here is a naive implementation of the QR factorisation described at
https://en.wikipedia.org/wiki/QR_decomposition

It is supposed to work with any rectangular matrix for which rows>=columns.

The test matrix is hardcoded to the example in the Wikipedia article, but you can input matrix A however you like.

The implementation is coded for clarity rather than speed: many of the matrix multiplies effectively operate on minors that are much smaller than the original matrix. Whether you want to optimise that depends on how big your matrices are.

#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <cmath>
#include <cassert>
using namespace std;

const double SMALL = 1.0E-30;          // used to stop divide-by-zero

using vec    = vector<double>;         // vector
using matrix = vector<vec>;            // matrix (=collection of (row) vectors)

// Function prototypes
void printMatrix( const matrix &A );
matrix matmul( const matrix &A, const matrix &B );
matrix identity( int n );
matrix transpose( const matrix &A );
bool QR( const matrix &A, matrix &Q, matrix &R );

//======================================================================


int main()
{
   matrix A = { { 12, -51,   4 },
                {  6, 167, -68 },
                { -4,  24, -41 } };              // Example in Wikipedia
   cout << "\nA:\n";   printMatrix( A  );

   matrix Q, R;

   if ( QR( A, Q, R ) )                          // Calls the QR function
   {
      cout << "\nQ: \n";   printMatrix( Q  );
      cout << "\nR: \n";   printMatrix( R  );

      cout << "\n\nCHECKS:\n";
      cout << "\nQR:\n"  ;   printMatrix( matmul( Q, R ) );
      cout << "\nQ.QT:\n";   printMatrix( matmul( Q, transpose( Q ) ) );
   }
   else
   {
      cout << "Unable to factorise\n";
   }
}


//======================================================================


void printMatrix( const matrix &A )
{
   const double NEARZERO = 1.0E-10;     // interpretation of "zero" for printing purposes
   for ( auto &row : A )
   {
      for ( auto x : row )
      {
         if ( abs( x ) < NEARZERO ) x = 0.0;
         cout << setw( 12 ) << x;
      }
      cout << '\n';
   }
}


//======================================================================


matrix matmul( const matrix &A, const matrix &B )          // Matrix times matrix
{
   int rowsA = A.size(),   colsA = A[0].size();
   int rowsB = B.size(),   colsB = B[0].size();
   assert( colsA == rowsB );

   matrix C( rowsA, vec( colsB, 0.0 ) );
   for ( int i = 0; i < rowsA; i++ )
   {
      for ( int j = 0; j < colsB; j++ )
      {
         // scalar product of ith row of A and jth column of B
         for ( int k = 0; k < colsA; k++ ) C[i][j] += A[i][k] * B[k][j];
      }
   }
   return C;
}


//======================================================================


matrix identity( int n )                                   // n x n Identity matrix
{
   matrix I( n, vec( n, 0.0 ) );
   for ( int i = 0; i < n; i++ ) I[i][i] = 1.0;
   return I;
}


//======================================================================


matrix transpose( const matrix &A )                        // Transpose
{
   int rows = A.size(),   cols = A[0].size();
   matrix AT( cols, vec( rows ) );
   for ( int i = 0; i < cols; i++ )
   {
      for ( int j = 0; j < rows; j++ ) AT[i][j] = A[j][i];
   }
   return AT;
}


//======================================================================


bool QR( const matrix &A, matrix &Q, matrix &R )
// Factorise A = QR, where Q is orthogonal, R is upper triangular
{
   int rows = A.size(), cols = A[0].size();
   if ( rows < cols )
   {
      cout << "Algorithm only works for rows >= cols\n";
      return false;
   }

   R = A;
   matrix QT = identity( rows );

   for ( int k = 0; k < cols && k < rows - 1; k++ )   // k is the working column; min(cols, rows-1) Householder steps are needed
   {
      // X vector, based on the elements from k down in the kth column
      double alpha = 0;
      for ( int i = k; i < rows; i++ ) alpha += R[i][k] * R[i][k];
      alpha = sqrt( alpha );                     // alpha is the Euclidean norm of Xk

      // V vector ( normalise Xk - alpha e_k )
      vec V( rows, 0.0 );
      double Vnorm = 0.0;
      for ( int i = k + 1; i < rows; i++ )
      {
         V[i] = R[i][k];
         Vnorm += V[i] * V[i];
      }
      V[k] = R[k][k] - alpha;
      Vnorm += V[k] * V[k];
      Vnorm = sqrt( Vnorm );
      if ( Vnorm > SMALL ) 
      {
         for ( int i = k; i < rows; i++ ) V[i] /= Vnorm;
      }

      // Householder matrix: Qk = I - 2 V VT
      matrix Qk = identity( rows );
      for ( int i = k; i < rows; i++ )
      {
         for ( int j = k; j < rows; j++ ) Qk[i][j] -= 2.0 * V[i] * V[j];
      }

      QT = matmul( Qk, QT );
      R  = matmul( Qk, R  );
   }

   Q = transpose( QT );

   return true;
}


//====================================================================== 

Output:

A:
          12         -51           4
           6         167         -68
          -4          24         -41

Q: 
    0.857143   -0.394286    0.331429
    0.428571    0.902857  -0.0342857
   -0.285714    0.171429    0.942857

R: 
          14          21         -14
           0         175         -70
           0           0         -35


CHECKS:

QR:
          12         -51           4
           6         167         -68
          -4          24         -41

Q.QT:
           1           0           0
           0           1           0
           0           0           1
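To turn the factorisation above into the least-squares solution you asked about, the remaining step is back substitution on R. A minimal sketch that reuses the vec, matrix and QR() definitions from the code above (backSubstitute and solveQR are names I've made up; it assumes R has a nonzero diagonal, i.e. the design matrix has full column rank):

// Solve R x = b by back substitution, using the top n x n block of R.
vec backSubstitute( const matrix &R, const vec &b )
{
   int n = R[0].size();
   vec x( n, 0.0 );
   for ( int i = n - 1; i >= 0; i-- )
   {
      double sum = b[i];
      for ( int j = i + 1; j < n; j++ ) sum -= R[i][j] * x[j];
      x[i] = sum / R[i][i];                 // assumes R[i][i] != 0
   }
   return x;
}

// Least-squares solution of A x ~= b via A = Q R:
// minimising ||A x - b|| reduces to solving R x = (Q^T b) over its first n rows.
vec solveQR( const matrix &A, const vec &b )
{
   matrix Q, R;
   QR( A, Q, R );
   int m = A.size(), n = A[0].size();
   vec Qtb( n, 0.0 );                       // first n entries of Q^T b
   for ( int i = 0; i < n; i++ )
   {
      for ( int k = 0; k < m; k++ ) Qtb[i] += Q[k][i] * b[k];
   }
   return backSubstitute( R, Qtb );
}

For the weighted problem, apply solveQR to the design matrix and observations after scaling each row i by sqrt(w_i).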