mahout-user mailing list archives

From Markus Holtermann <i...@markusholtermann.eu>
Subject Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Date Fri, 23 Sep 2011 22:42:49 GMT
Thank you for all your responses.

ref. Dan Brickley:
------------------
hopefully you did dream ;-)

ref. Dmitriy Lyubimov:
----------------------
When I run `mahout ssvd -i A.seq -o A-ssvd/ -k 3 -p 0` I get an
IllegalArgumentException. You can find the traceback at
http://paste.pocoo.org/show/481168/ .

ref. Ted Dunning:
-----------------
I am running the M/R version of SVD in local mode. I did not install
Hadoop separately; I only have what `mvn install` pulls in.
If I understand the code correctly, the `--inMemory` argument is only
relevant for the "EigenVerificationJob", which I did not run.

Here are the latest results for the calculations as described in my
previous mail:

For 1:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344411508600611:
{0:0.8940505788976013,1:0.05761556873901637,2:-0.44424543735613486}
Key: 1: Value: eigenVector1, eigenvalue = 0.0:
{0:-0.3030457633656634,1:0.8081220356417685,2:-0.5050762722761053}
Key: 2: Value: eigenVector2, eigenvalue = -0.4362482432944815:
{0:0.3299042704770375,1:0.5861904313011974,2:0.7399621277956934}
Count: 3

For 2:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814282762082:
{0:0.7369762290995766,1:0.3279852776056837,2:-0.5910090485061045}
Key: 1: Value: eigenVector1, eigenvalue = 0.17091518882717976:
{0:0.9225878132457447,1:0.3812202473600341,2:0.05918487858557608}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:-0.5910090485061055,1:0.7369762290995774,2:-0.3279852776056802}
Key: 3: Value: eigenVector3, eigenvalue = -0.5157294715892533:
{0:-0.32798527760568197,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 3:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814080004587:
{0:0.2870124314018251,1:-0.8054865010309287,2:0.5184740696291035}
Key: 1: Value: eigenVector1, eigenvalue = 0.4852290375835231:
{0:0.9000472484774761,1:0.041469409433508436,2:-0.4338147514658307}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.3279311127797073,1:0.5911613863727806,2:0.7368781449689461}
Count: 3

For 4:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.34481428276208:
{0:0.788451139115581,1:0.5058848349238699,2:0.3498933194866569}
Key: 1: Value: eigenVector1, eigenvalue = 0.5157294715892401:
{0:-0.5910090485061453,1:0.7369762290995597,2:-0.32798527760564816}
Key: 2: Value: eigenVector2, eigenvalue = 0.1709151888272022:
{0:-0.7369762290995447,1:-0.3279852776057236,2:0.5910090485061223}
Key: 3: Value: eigenVector3, eigenvalue = 0.0:
{0:-0.3279852776056819,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 5:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.7949818262315:
{0:-0.3998289016610171,1:0.3486764982772797,2:0.8476800982361441}
Key: 1: Value: eigenVector1, eigenvalue = 0.0:
{0:0.3244428422615253,1:-0.8111071056538125,2:0.4866642633922878}
Key: 2: Value: eigenVector2, eigenvalue = -2.2686660367578133:
{0:0.8572477421969729,1:0.4696061783100697,2:0.21117846905213422}
Count: 3

For 6:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 9.903422603237882:
{0:-0.305869782876591,1:-0.012493432384138303,2:0.9519913813004245}
Key: 1: Value: eigenVector1, eigenvalue = 6.002722238353203:
{0:-0.7781330995244824,1:0.06366543541563939,2:0.624864458709054}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.2988138112963618,1:0.9481291552697455,2:0.10845003967736172}
Key: 3: Value: eigenVector3, eigenvalue = -3.906144841591079:
{0:0.9039656974142156,1:-0.3176397630567398,2:0.2862708487144453}
Count: 4

For 7:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.04924152040162:
{0:-0.4082482904638631,1:0.8164965809277261,2:-0.4082482904638631}
Key: 1: Value: eigenVector1, eigenvalue = 3.782617346103868:
{0:0.7808892910047764,1:0.08072916428282848,2:-0.6194309624391194}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.47280571964327067,1:0.5716783495703939,2:0.6705509794975171}
Count: 3

For 8:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.964450219004663:
{0:NaN,1:NaN,2:NaN}
Key: 1: Value: eigenVector1, eigenvalue = 7.000000000000002:
{0:NaN,1:NaN,2:NaN}
Key: 2: Value: eigenVector2, eigenvalue = 0.753347668076679:
{0:NaN,1:NaN,2:NaN}
Key: 3: Value: eigenVector3, eigenvalue = 0.0:
{0:NaN,1:NaN,2:NaN}
Count: 4
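
One observation on the listings for cases 4 and 8, offered as a sketch and
not something stated in this thread: the singular values of a real square
matrix M satisfy two identities. The sum of their squares equals the squared
Frobenius norm of M, and their product equals |det M|. The nonzero values
printed for case 4 (matrix A) and case 8 (matrix B) appear to satisfy both,
which would be consistent with the job reporting singular values and not
signed eigenvalues. Both identities are necessary conditions only, so this is
a plausibility check, not a proof. The helper names are hypothetical; A and B
are the matrices from my earlier mail (quoted below).

```python
def frobenius_sq(m):
    """Squared Frobenius norm: the sum of all squared entries."""
    return sum(x * x for row in m for x in row)


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def singular_value_gaps(m, values):
    """Return (sum-of-squares gap, product gap) for candidate singular values.

    Both gaps are close to zero if `values` could be the singular values of m.
    """
    product = 1.0
    for v in values:
        product *= v
    sum_sq_gap = sum(v * v for v in values) - frobenius_sq(m)
    product_gap = product - abs(det3(m))
    return sum_sq_gap, product_gap


A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]

# Nonzero "eigenvalues" copied from the case 4 and case 8 listings above.
case4 = [11.34481428276208, 0.5157294715892401, 0.1709151888272022]
case8 = [7.964450219004663, 7.000000000000002, 0.753347668076679]

print("case 4 gaps:", singular_value_gaps(A, case4))
print("case 8 gaps:", singular_value_gaps(B, case8))
```

For a symmetric matrix such as A, the singular values are the absolute values
of the eigenvalues, which would explain why case 4 shows 0.5157 where the
hand calculation gives -0.515729.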


ref. Danny Bickson:
-------------------
Thanks for confirming how to use the rank.
Regarding the scale factor and orthogonalization: yes, I take both into
account. I'm running SVD from trunk without any changes. Even after
commenting out those parts of the code, the results are still wrong in
cases 1, 2, 3, 7 and 8.

Thank you for your help.

Markus


> On 22 Sep 2011, at 18:37, Markus Holtermann 
> <info@markusholtermann.eu> wrote:
> 
>> Hello there,
>> 
>> I'm trying to run Mahout's Singular Value Decomposition but
>> realized that the resulting eigenvalues are wrong in most cases.
>> So I took two small 3x3 matrices, calculated their eigenvalues
>> and eigenvectors by hand, and compared the results to Mahout's.
>> 
>> In only one of eight cases did Mahout's results match my
>> pen-and-paper results.
>> 
>> Let's take A = {{1,2,3},{2,4,5},{3,5,6}} and
>> B = {{5,2,4},{-3,6,2},{3,-3,1}}
>> 
>> As you can see, A is symmetric, B is not.
>> 
>> I ran `mahout svd --output out/ --numRows 3 --numCols 3` eight 
>> times with different arguments:
>> 
>> 1) --input A --rank 3 --symmetric true    result is wrong
>> 2) --input A --rank 4 --symmetric true    result is wrong
>> 3) --input A --rank 3 --symmetric false   result is wrong
>> 4) --input A --rank 4 --symmetric false   result is CORRECT
>> 
>> 5) --input B --rank 3 --symmetric true    result is wrong
>> 6) --input B --rank 4 --symmetric true    result is wrong
>> 7) --input B --rank 3 --symmetric false   result is wrong
>> 8) --input B --rank 4 --symmetric false   result is wrong
>> 
>> To verify that my input data is correct, this is the result of 
>> `mahout seqdumper`
>> 
>> For A:
>> Key class: class org.apache.hadoop.io.IntWritable
>> Value Class: class org.apache.mahout.math.VectorWritable
>> Key: 0: Value: {0:1.0,1:2.0,2:3.0}
>> Key: 1: Value: {0:2.0,1:4.0,2:5.0}
>> Key: 2: Value: {0:3.0,1:5.0,2:6.0}
>> Count: 3
>> 
>> For B:
>> Key class: class org.apache.hadoop.io.IntWritable
>> Value Class: class org.apache.mahout.math.VectorWritable
>> Key: 0: Value: {0:5.0,1:2.0,2:4.0}
>> Key: 1: Value: {0:-3.0,1:6.0,2:2.0}
>> Key: 2: Value: {0:3.0,1:-3.0,2:1.0}
>> Count: 3
>> 
>> 
>> And finally, the correct eigenvalues should be:
>> 
>> For A: λ1 = 11.3448, λ2 = -0.515729, λ3 = 0.170915
>> 
>> For B: λ1 = 7, λ2 = 3, λ3 = 2
>> 
>> So, are there any known bugs in Mahout's SVD implementation? Am I 
>> doing something wrong? Is this algorithm known to produce wrong 
>> results?
>> 
>> Thanks in advance.
>> 
>> Markus
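
The expected eigenvalues stated in the quoted message can be checked without
Mahout or any linear-algebra library. For a 3x3 matrix M, the characteristic
polynomial is det(xI - M) = x^3 - tr(M) x^2 + (sum of the principal 2x2
minors) x - det(M), and every eigenvalue must be a root of it. The sketch
below is not from the thread; the function names are made up for
illustration.

```python
def char_poly_coeffs(m):
    """Return (c2, c1, c0) with det(xI - m) = x^3 + c2*x^2 + c1*x + c0."""
    (a, b, c), (d, e, f), (g, h, i) = m
    trace = a + e + i
    # Sum of the three principal 2x2 minors.
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return -trace, minors, -det


def residual(m, lam):
    """Evaluate the characteristic polynomial of m at lam (0 for an exact eigenvalue)."""
    c2, c1, c0 = char_poly_coeffs(m)
    return lam ** 3 + c2 * lam ** 2 + c1 * lam + c0


A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]

print(char_poly_coeffs(A))  # (-11, -4, 1)
print(char_poly_coeffs(B))  # (-12, 41, -42)

# Eigenvalues as stated in the quoted message (A's are rounded).
for lam in (11.3448, -0.515729, 0.170915):
    print("A", lam, residual(A, lam))
for lam in (7, 3, 2):
    print("B", lam, residual(B, lam))
```

The residuals for B are exactly zero. Those for A are small but nonzero,
because the stated values are rounded to about six significant digits.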

