In this second and half week I worked on the pcg to try to improve it.
All the codes can be found at https://bitbucket.org/CrisDorigo/socis16-octave-iter_meth/src
in the folder "pcg_codes".
The first thing that I made is the private function __checkandstring__ (the name is not definitive).
This function checks if the input argument A, b, M1 and M2 are consistent, i.e. if A, M1, M2 are matrices or function handle, and if A is a matrix, checks also that the number of columns of A are equal to the number of rows of b.
In addition, this function has as output two strings: is_prec and type.
The first one can be "prec" if M1 is not empty, or "no_prec" otherwise (I check only M1 because in the pcg documentation is written that if M1 is empty then no preconditioner is applied).
Instead the string type can be "matrix", "function_handle" or "mixed". If M1 empty then type is "matrix" if A is a matrix, "function handle" otherwise.
If M1 is not empty then the string type is "matrix" if A, M1, M2 matrices, "function_handle" if they are all function handle or "mixed" if not all of them are matrices or function handle.
I made this function as general as possible to use it also for the other algorithms that need an improvement in my project.
You can see the code in the function __checkandstring__.m.
After that I take the pcg Octave code and I adapted it to the different situation:
I made a switch-case that treat differently the three cases mentioned before ("matrix", "function handle" and "mixed"). Moreover the cases "matrix" and "function handle", have an if block that check if there exist the preconditioners: if they are not passed, then applies the unpreconditioned conjugate gradient, otherwise applies the preconditioned one.
For the "mixed" case instead, differently from my previous post (in which I said that if A, M1 and M2 are not of the same type, then I set them as functions handle) I check every time the type of A / M1 / M2 and if it is a matrix then I apply the simple matrix-vector product, if it is a function handle then I made the evaluation.
You can see this code in the function pcg_tmp_02.m.
I made this choice because I noticed that it is more (time) efficient than to transform all in functions handle. Indeed, before the two codes mentioned above, I made a function similar to __checkandstring__ that in case A, M1 and M2 not of the same type then transform them all in function handle, and then I made a version of pcg that works with this function (you can see these code in the files __check02__ and pcg_tmp_01 respectively). Then I tested the octave pcg, pcg_tmp_01 and pcg_tmp_02 20 times on the same example, with and without preconditioners, and passing all the variables as matrices, as function handles and not all of the same type. At the end I plotted the times and I compared it and I noticed that the better performs are given form pcg_tmp_02.
You can see this example in the file example_pcg.m.
Talking with the mentors, they told me that (using as example pcg_tmp_02) the code is too long. This because the pcg code is "repeated" 5 times:
- matrix case without preconditioners
- matrix case with preconditioners
- function handle case without preconditioners
- function handle case with preconditioners
- mixed case (it is only with preconditiones because otherwise there is only A and it is or a matrix or a function handle)
They suggested me to substitute this switch-case approach with another approach, for example to use nested function or to insert some scripts in the private folder. They told me to ask to all the maintainers which is your preferred approach or to suggest me another one in case.Complex case
Making some tests with complex matrices, I noticed strange behaviour of the pcg.
For example if I test the Octave pcg with the matrix
A = [2 , 2+1i, 4;
2-1i , 3, 1i;
4 , -1i , 1 ]
that is an hermitian matrix, but not positive definite (indeed its eigenvalue are not all positive: -3.16512506440744e+00 2.85307928516935e+00 6.31204577923809e+00 )
and with right-hand-side
b = [ 5.59693687377178e-01;
the Octave pcg doesn't notice that this matrix is not positive definite.
Indeed the result is:
x = pcg(A,b)
warning: pcg: maximum number of iterations (3) reached
warning: the initial residual norm was reduced 2.12736e+15 times.
2.04219909327648e-01 - 3.21781116721895e-02i
-1.11630690554651e-01 + 5.38299446813016e-02i
1.07086298628122e-01 + 1.70817561341068e-02i
Studying the code, I notice that the only check for the positive definitiveness of the matrix is on alpha. Not going into deep details, this alpha is necessary to compute the approximant solution at every step (indeed at every iteration, x = x + alpha*p for a certain p).
If there are no preconditioners (as in our example) alpha is defined as:
alpha = < r, r > / < p, A*p >
(where < , > is the vector scalar product, and r is the residual of the previous iteration).
Then, since we suppose that A is positive definite, alpha must be real and positive (both in case A real or complex matrix).
Since the numerator is positive for every r, if alpha is negative then surely A is not positive definite. Indeed the check in the Octave pcg code is:
if (alpha <= 0.0)
## Negative matrix.
matrix_positive_definite = false;
If A is not positive definite then < p, A*p> is not always negative, so if we are "lucky", if we pass a not positive definite matrix in pcg and from the computation alpha is positive at every iteration, there is nothing that we can do.
But if we use A and b definite as previous, pcg makes three iterations and the alpha are:
alpha = 2.11061741184511e-01 - 6.08511684384459e-20i
alpha = -5.59991838158184e-01 + 1.05798887633281e-17i
alpha = 1.48434184396776e-01 + 8.13133738226767e-18i
(the function pcg01.m in the folder "pcg_codes" is the octave version of pcg that print at every iteration alpha)
We notice that these three alpha have an imaginary part, but it is under the machine precision, so they are not computationally relevant.
But the second alpha is (obviously) negative, and the pcg doesn't notice it!
This because for Octave:
octave4.1-cli:29> -5.59991838158184e-01 + 1.05798887633281e-17i < 0
ans = 0
octave4.1-cli:30> -5.59991838158184e-01 + 1.05798887633281e-17i > 0
ans = 1
Making some tests I notice that a number with an imaginary part is greater than 0, also if this imaginary part is negative, indeed
octave4.1-cli:31> -3-1i > 0
ans = 1
I don't understand very well what Octave do to compare two complex numbers (I think that it compares the modules, i.e. if we want to verify if a < b, where at least one of a or b is complex, then it compares abs(a) and abs(b), but I'm not sure).
I compared some complex numbers also in Matlab, and I think that it compares only the real part (and "forgetting" the imaginary part).
I talked with the mentors about this fact, but there are some questions:
- Is it correct to check only the real part of alpha?
- Also if the imaginary part of alpha is relevant?
(if for example we use the matrix
A = [ 0.71290 + 0.59353i 0.97470 + 0.36591i 0.50060 + 0.53652i;
0.37411 + 0.11662i 0.38904 + 0.43489i 0.03555 + 0.23431i;
0.35482 + 0.23601i 0.44859 + 0.31402i 0.54356 + 0.72676i]
that is a matrix not symmetric, and not positive definite,
and b = ones(3,1), we obtain as alpha:
alpha = 0.47882 - 0.51052i
alpha = 1.4276 - 1.0314i
alpha = 0.43535 - 0.36971i
and the pcg doesn't give any flag of this "non-correctness" of A, but also if we check only the real part of alpha the code doesn't give any flag or any error)
- Do we break the algorithm if alpha is complex? Also if the imaginary part is under the machine precision?
(it is possible that also if A is positive definite and symmetric but the alpha are complex, because of computations and the machine precision)
The mentors suggested me two things:
- First of all to check both numerator and denominator of alpha: this because if there are preconditioners also the numerator can be negative, since it becomes <M*r, r>, where M is the inverse of M1*M2.
- To break the algorithm if the imaginary part of (the numerator or the denominator of) alpha is relevant. But, how to decide if it is relevant?
For example, 1e-15 is relevant? And 1e-14?
So they suggested me a possible criterion to use to determine the relevancy of this imaginary part:
if real(alpha) + (eps / tol)*imag(alpha) == real(alpha)
The motivation of this criterion is because, since we want to find a solution within a tolerance (tol), we "normalize" the relevancy of this imaginary part with this tolerance.
The mentors also told me to ask also to all the maintainers, because this is their suggestion, but can be different from the direction that wants the Octave community.
Then I please ask from the community some advices of how to proceed for this situation.
Actually (and until I'll have some advices) I'm working on the gmres, and I'll try to improve it in the same manner of the first part of this post for the pcg.
Please contact me with some feedback via e-mail at email@example.com or as a comment under this post.