Hi Pedro,
I wrote the Lanczos stuff in Mahout, but I've been swamped with my new job
the past few months, and haven't been able to interact on the list much, but
I've got a minute or two to try and answer your questions.
First: the LanczosSolver does indeed leave you with eigenvalues in the
opposite order than you expect (ascending order, instead of descending). It
should be noted that in general, the Lanczos method *does not* give you the
"largest N eigenvalues" - if you ask for a rank-N decomposition via Lanczos,
you'll get the smallest eigenvalue for sure, then somewhat less than N/2 of
the other small eigenvalues (with the last few being right in the middle of
the spectrum), then somewhat less than N/2 of the largest eigenvalues (with
the first few being in the middle of the spectrum), ending with the largest
eigenvalue.
If you really want the top N eigenvalues, then you should ask for somewhere
between 3N and 4N rank decomposition, and take the last N of these. You
won't be guaranteed to get *exactly all* of the top N eigenvalues,
especially if you have a very rapidly decreasing eigenvalue spectrum (as is
usually the case with real data), and you reach the "plateau" of middling
values before N. In this case, if there are K wrote:
>
> Hi Ted,
>
> I can't give you an exact amount but more or less it could be around 10^5
> non-zero elements per row.
>
> Could you please let me know, why the lanzcos algorithm is not always
> returning the values in a decreasing order?
>
> Thanks.
>
> Pedro.
>
> ----------------------------------------
> > From: ted.dunning@gmail.com
> > Date: Fri, 19 Nov 2010 13:34:19 -0800
> > Subject: Re: Lanczos Algorithm
> > To: user@mahout.apache.org
> >
> > How many non-zero elements?
> >
> > On Fri, Nov 19, 2010 at 12:34 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
> > pmjimenez1983@hotmail.com> wrote:
> >
> > >
> > >
> > > I was talking about 10^9 rows and 10^9 columns
> > >
> > > ----------------------------------------
> > > > From: ted.dunning@gmail.com
> > > > Date: Fri, 19 Nov 2010 12:07:16 -0800
> > > > Subject: Re: Lanczos Algorithm
> > > > To: user@mahout.apache.org
> > > >
> > > > On Fri, Nov 19, 2010 at 11:17 AM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
> > > > pmjimenez1983@hotmail.com> wrote:
> > > >
> > > > > In this project I would have to work with matrix of 10^9, which
> have a
> > > very
> > > > > sparse data.
> > > >
> > > >
> > > > I think you mean 10^9 rows and 10^9 columns with much fewer 10^18
> > > non-zero
> > > > elements.
> > > >
> > > > Is that correct?
> > > >
> > > > Can you say how many non-zero elements?
> > >
> > >
>
>