mahout-user mailing list archives

From agnonchik <>
Subject Lanczos SVD scalability
Date Tue, 05 Jul 2011 06:27:07 GMT
What could be the reason for poor Lanczos SVD scalability on a cluster? I
don't observe any speed-up at all when increasing the number of nodes. What
am I doing wrong?

I'm processing a 10000x1000 matrix with 1% non-zeros. The elapsed CPU time
scales like this:
1 slave node - 89m39.399s
2 slave nodes - 93m47.435s
8 slave nodes - 89m20.821s
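
To put numbers on this, here is a minimal sketch that converts the three
timings above into speed-up and parallel efficiency relative to the
single-node run. The helper name parse_time and the variable names are
made up for illustration; only the timings come from the measurements above.

```python
def parse_time(text):
    """Convert a string such as '89m39.399s' into seconds (hypothetical helper)."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

# Timings as reported above, keyed by number of slave nodes.
timings = {1: "89m39.399s", 2: "93m47.435s", 8: "89m20.821s"}
seconds = {nodes: parse_time(t) for nodes, t in timings.items()}

baseline = seconds[1]
# Speed-up relative to one node; efficiency is speed-up divided by node count.
speedup = {nodes: baseline / s for nodes, s in seconds.items()}
efficiency = {nodes: speedup[nodes] / nodes for nodes in seconds}

for nodes in sorted(seconds):
    print(f"{nodes} node(s): {seconds[nodes]:.1f} s, "
          f"speed-up {speedup[nodes]:.3f}, efficiency {efficiency[nodes]:.3f}")
```

The speed-up stays at roughly 1 for both the 2-node and the 8-node runs, so
the extra nodes contribute essentially nothing.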

I checked the output (cleanEigenvectors); the eigenvectors are mathematically
correct.

Cluster specs:
Intel Core2 Duo E7200 @ 2.53 GHz CPUs
Gigabit Ethernet
Each node has an 80 GB hard drive

I saved the matrix in the sequential format to HDFS. Should I save it in
another format so that it can be processed in parallel?
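
A rough size estimate may be relevant to this question. The sketch below
counts the non-zeros implied by the matrix dimensions above and compares an
approximate on-disk size with an HDFS block. Two values are assumptions, not
measurements: about 12 bytes per stored non-zero (a 4-byte index plus an
8-byte double, ignoring file overhead and compression) and a 64 MB block
size (the actual dfs.block.size on this cluster was not checked).

```python
import math

# Matrix described above: 10000 x 1000 with 1% non-zeros.
rows, cols, density_percent = 10_000, 1_000, 1
nonzeros = rows * cols * density_percent // 100

# Assumptions for illustration only (see the paragraph above).
BYTES_PER_NONZERO = 12              # 4-byte index + 8-byte double
BLOCK_SIZE = 64 * 1024 * 1024       # assumed HDFS block size in bytes

approx_bytes = nonzeros * BYTES_PER_NONZERO
blocks = max(1, math.ceil(approx_bytes / BLOCK_SIZE))

print(f"non-zeros: {nonzeros}")
print(f"approximate size: {approx_bytes / 1e6:.1f} MB")
print(f"HDFS blocks at the assumed block size: {blocks}")
```

Under these assumptions the whole matrix fits in a single block. If the
number of map tasks follows the number of input blocks (also an assumption
about how the job splits its input), that could be part of the picture.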
