Jake, since we are on the topic, what might the running time of Lanczos
be on a sequence file input of ~1G?
On Wed, Apr 6, 2011 at 11:11 AM, Jake Mannix <jake.mannix@gmail.com> wrote:
>
>
> On Thu, Mar 24, 2011 at 11:03 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>>
>> you can certainly try to write it out into a DRM (distributed row
>> matrix) and run stochastic SVD on hadoop (off the trunk now). see
>> MAHOUT-593. This is suitable if you have a good decay of singular
>> values (but if you don't it probably just means you have so much noise
>> that it masks the problem you are trying to solve in your data).
>
> You don't need to run it as stochastic, either. The regular LanczosSolver
> will work on this data, if it lives as a DRM.
>
> jake
