Hugo Ferrira commented on MATH1403:

Hello Gilles,
Thanks for the feedback. Unfortunately, I am not knowledgeable enough to tackle this task myself.
I did confirm that the original R code uses the BLAS library and that its implementation
is also a rank-revealing QR decomposition. What I find interesting is that the rank value
is available once the decomposition completes, without any explicit function being called. So the two
don't seem to be implementations of the same algorithm.
As I said, I don't know much about numerical methods. However, if someone can
point me to a simple description of an algorithm, I could try to debug it.
Thanks
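
As a possible starting point for the "simple description of an algorithm" asked for above, here is a minimal sketch of one textbook way to reveal rank: modified Gram-Schmidt with column pivoting. It is not the algorithm used by R or by RRQRDecomposition; the class name PivotedQrRank, the method rank and the relative tolerance relTol are made up for illustration, and the input is assumed to be a non-empty rectangular array.

```java
public class PivotedQrRank {

    /** Euclidean norm of a vector. */
    static double norm(double[] v) {
        double s = 0.0;
        for (double x : v) {
            s += x * x;
        }
        return Math.sqrt(s);
    }

    /**
     * Estimates the numerical rank of a (rows x cols) matrix given as a[row][col].
     * At each step the remaining column with the largest norm is chosen as pivot,
     * normalised, and its component is removed from the other remaining columns.
     * The loop stops when the largest remaining norm is negligible relative to
     * the first pivot norm.
     */
    static int rank(double[][] a, double relTol) {
        int rows = a.length;
        int cols = a[0].length;
        // Column-major working copy, so the caller's array is left untouched.
        double[][] c = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                c[j][i] = a[i][j];
            }
        }
        boolean[] used = new boolean[cols];
        double firstNorm = 0.0;
        int rank = 0;
        for (int step = 0; step < Math.min(rows, cols); step++) {
            // Pivot: unused column with the largest remaining norm.
            int p = -1;
            double best = 0.0;
            for (int j = 0; j < cols; j++) {
                if (!used[j]) {
                    double n = norm(c[j]);
                    if (n > best) {
                        best = n;
                        p = j;
                    }
                }
            }
            if (p < 0) {
                break; // everything left is exactly zero
            }
            if (step == 0) {
                firstNorm = best;
            }
            if (best <= relTol * firstNorm) {
                break; // everything left is numerically zero
            }
            used[p] = true;
            rank++;
            for (int i = 0; i < rows; i++) {
                c[p][i] /= best;
            }
            // Remove the pivot direction from the remaining columns.
            for (int j = 0; j < cols; j++) {
                if (!used[j]) {
                    double dot = 0.0;
                    for (int i = 0; i < rows; i++) {
                        dot += c[p][i] * c[j][i];
                    }
                    for (int i = 0; i < rows; i++) {
                        c[j][i] -= dot * c[p][i];
                    }
                }
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        // The five vectors from the issue, laid out as the columns of a 6 x 5 matrix.
        double[][] a = {
            {1, 1, 0, 1, 0},
            {1, 1, 0, 0, 0},
            {1, 1, 0, 0, 1},
            {1, 0, 1, 1, 0},
            {1, 0, 1, 0, 0},
            {1, 0, 1, 0, 1}
        };
        System.out.println("Pivoted Gram-Schmidt rank: " + rank(a, 1e-10));
    }
}
```

The relative tolerance plays the role of the threshold in the issue: a column is dropped once what is left of it, after removing the directions already found, is tiny compared with the first pivot. Comparing the intermediate norms of this sketch with the diagonal of R from the library might help narrow down where the two disagree.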
Collinearity test: QR Decomposition rank incorrect (SVD ok)
> 
>
Key: MATH1403
URL: https://issues.apache.org/jira/browse/MATH1403
Project: Commons Math
Issue Type: Bug
Affects Versions: 3.6.1
Environment: Linux ubuntu
JDK 8
Reporter: Hugo Ferrira
>
> Hello,
> I am aware that such a question has been asked before, but I cannot seem to solve this
> issue even for a very simple example. The closest example I have found is:
> https://issues.apache.org/jira/browse/MATH1100
> from which I could not get an answer.
> I am trying to copy an algorithm from R's Caret package that identifies collinear columns
> of a matrix [1]. I am assuming a "long" matrix and am using the trivial example from that
> reference. However, I cannot get this to work because the rank reported by QR is incorrect.
> I have the following example:
> import org.apache.commons.math3.linear.RealMatrix;
> import org.apache.commons.math3.linear.RRQRDecomposition;
> import org.apache.commons.math3.linear.Array2DRowRealMatrix;
> import org.apache.commons.math3.linear.SingularValueDecomposition ;
> public class QRIssue {
> public static void main(String[] args) {
> double[][] am = new double[5][];
> double[] c1 = new double[] {1.0, 1.0, 1.0, 1.0, 1.0, 1.0} ;
> double[] c2 = new double[] {1.0, 1.0, 1.0, 0.0, 0.0, 0.0} ;
> double[] c3 = new double[] {0.0, 0.0, 0.0, 1.0, 1.0, 1.0} ;
> double[] c4 = new double[] {1.0, 0.0, 0.0, 1.0, 0.0, 0.0 } ;
> double[] c6 = new double[] {0.0, 0.0, 1.0, 0.0, 0.0, 1.0 } ;
> am[0] = c1 ;
> am[1] = c2 ;
> am[2] = c3 ;
> am[3] = c4 ;
> am[4] = c6 ;
> Double threshold = 1e1;
> Array2DRowRealMatrix m = new Array2DRowRealMatrix( am, false ) ; // use array, don't copy
> RRQRDecomposition qr = new RRQRDecomposition( m, threshold) ;
> RealMatrix r = qr.getR() ;
> int numColumns = r.getColumnDimension() ;
> int rank = qr.getRank( threshold ) ;
> System.out.println("QR rank: " + rank) ;
> System.out.println("QR is singular: " + !qr.getSolver().isNonSingular()) ;
> System.out.println("QR is singular: " + (rank < numColumns) ) ;
> SingularValueDecomposition sv2 = new org.apache.commons.math3.linear.SingularValueDecomposition(m);
> System.out.println("SVD rank: " + sv2.getRank()) ;
> }
> }
> For SVD I get a rank of 4, which is correct (columns 0, 1 and 2 are collinear: c0 = c1 + c2).
> For QR, however, I get 5. I have tried several thresholds with no success. For several subsets of
> the columns above (for example, only 0, 1 and 2) I get the correct answer. What am I doing wrong?
> TIA,
> Hugo F.
> 1. https://topepo.github.io/caret/preprocessing.html#lindep
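
One detail worth checking in the example above: the text describes c1 to c6 as columns of a "long" matrix, but a double[][] handed to Array2DRowRealMatrix is read row by row, so the five vectors become the rows of a 5 x 6 matrix. The mathematical rank is the same either way, so this may or may not be related to the QR result. A minimal sketch of laying the vectors out as columns, where the class ColumnsFromVectors and the helper asColumns are hypothetical names:

```java
public class ColumnsFromVectors {

    /** Builds a rows x cols array whose j-th column is vectors[j]. */
    static double[][] asColumns(double[][] vectors) {
        int cols = vectors.length;
        int rows = vectors[0].length;
        double[][] out = new double[rows][cols];
        for (int j = 0; j < cols; j++) {
            for (int i = 0; i < rows; i++) {
                out[i][j] = vectors[j][i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The five vectors from the issue, in the order they are stored in am.
        double[][] vectors = {
            {1.0, 1.0, 1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 0.0, 0.0, 0.0},
            {0.0, 0.0, 0.0, 1.0, 1.0, 1.0},
            {1.0, 0.0, 0.0, 1.0, 0.0, 0.0},
            {0.0, 0.0, 1.0, 0.0, 0.0, 1.0}
        };
        double[][] longMatrix = asColumns(vectors);
        // Six rows and five columns: one column per vector.
        System.out.println(longMatrix.length + " x " + longMatrix[0].length);
    }
}
```

The resulting array could then be passed to the matrix constructor in place of am, so that the column count seen by the decomposition matches the number of vectors.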

