Jake, since we are on the topic, what might the running time of Lanczos
be on a ~1 GB sequence file input?
On Wed, Apr 6, 2011 at 11:11 AM, Jake Mannix wrote:
> On Thu, Mar 24, 2011 at 11:03 PM, Dmitriy Lyubimov wrote:
>> You can certainly try to write it out into a DRM (distributed row
>> matrix) and run stochastic SVD on Hadoop (off the trunk now). See
>> MAHOUT-593. This is suitable if you have a good decay of singular
>> values (if you don't, it probably just means you have so much noise
>> that it masks the problem you are trying to solve in your data).
> You don't need to run it as stochastic, either. The regular LanczosSolver
> will work on this data, if it lives as a DRM.
> jake
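
To make the "good decay of singular values" remark above concrete, here is
a minimal Python sketch. It does not run stochastic SVD or Lanczos and does
not use any Mahout API. It only computes the relative Frobenius error of the
best possible rank-k approximation (Eckart-Young) for a given spectrum. The
function name truncation_error and both spectra are made up for illustration
and are not taken from any real data set.

```python
import math

def truncation_error(singular_values, k):
    """Relative Frobenius error of the best rank-k approximation.

    By the Eckart-Young theorem this is
    sqrt(sum of squared singular values beyond the k largest
         / sum of all squared singular values).
    """
    ordered = sorted(singular_values, reverse=True)
    total = sum(s * s for s in ordered)
    tail = sum(s * s for s in ordered[k:])
    return math.sqrt(tail / total)

# Two hypothetical spectra with 100 singular values each (illustrative only).
fast_decay = [0.5 ** i for i in range(100)]  # geometric decay
flat = [1.0] * 100                           # no decay, noise-like

k = 10
print("fast decay:", truncation_error(fast_decay, k))
print("flat:      ", truncation_error(flat, k))
```

With k = 10 the geometric spectrum leaves a relative error of about 0.001,
while the flat spectrum leaves about 0.95. No low-rank method, stochastic or
otherwise, can do better than these numbers at that rank, which is one way
to read the caveat about noisy data.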
