mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jake Mannix <jake.man...@gmail.com>
Subject Re: Lanczos Algorithm
Date Mon, 22 Nov 2010 22:01:31 GMT
Hi Pedro,

  I wrote the Lanczos stuff in Mahout, but I've been swamped with my new job
the past few months, and haven't been able to interact on the list much, but
I've got a minute or two to try and answer your questions.

  First: the LanczosSolver does indeed leave you with eigenvalues in the
opposite order than you expect (ascending order, instead of descending).  It
should be noted that in general, the Lanczos method *does not* give you the
"largest N eigenvalues" - if you ask for a rank-N decomposition via Lanczos,
you'll get the smallest eigenvalue for sure, then somewhat less than N/2 of
the other small eigenvalues (with the last few being right in the middle of
the spectrum), then somewhat less than N/2 of the largest eigenvalues (with
the first few being in the middle of the spectrum), ending with the largest
eigenvalue.

If you really want the top N eigenvalues, then you should ask for somewhere
between 3N and 4N rank decomposition, and take the last N of these.  You
won't be guaranteed to get *exactly all* of the top N eigenvalues,
especially if you have a very rapidly decreasing eigenvalue spectrum (as is
usually the case with real data), and you reach the "plateau" of middling
values before N.  In this case, if there are K<N "large" eigenvalues, you'll
get all K of these, and then N-K's worth of a sampling of the middling
eigenvalues.

One other thing you need to remember about the LanczosSolver: if your input
matrix is symmetric, pass in the boolean isSymmetric=true to the solve()
method, or else you'll get wrong values.

  -jake

On Mon, Nov 22, 2010 at 1:07 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
pmjimenez1983@hotmail.com> wrote:

>
> Hi Ted,
>
> I can't give you an exact amount but more or less it could be around 10^5
> non-zero elements per row.
>
> Could you please let me know, why the lanzcos algorithm is not always
> returning the values in a decreasing order?
>
> Thanks.
>
> Pedro.
>
> ----------------------------------------
> > From: ted.dunning@gmail.com
> > Date: Fri, 19 Nov 2010 13:34:19 -0800
> > Subject: Re: Lanczos Algorithm
> > To: user@mahout.apache.org
> >
> > How many non-zero elements?
> >
> > On Fri, Nov 19, 2010 at 12:34 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
> > pmjimenez1983@hotmail.com> wrote:
> >
> > >
> > >
> > > I was talking about 10^9 rows and 10^9 columns
> > >
> > > ----------------------------------------
> > > > From: ted.dunning@gmail.com
> > > > Date: Fri, 19 Nov 2010 12:07:16 -0800
> > > > Subject: Re: Lanczos Algorithm
> > > > To: user@mahout.apache.org
> > > >
> > > > On Fri, Nov 19, 2010 at 11:17 AM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
> > > > pmjimenez1983@hotmail.com> wrote:
> > > >
> > > > > In this project I would have to work with matrix of 10^9, which
> have a
> > > very
> > > > > sparse data.
> > > >
> > > >
> > > > I think you mean 10^9 rows and 10^9 columns with much fewer 10^18
> > > non-zero
> > > > elements.
> > > >
> > > > Is that correct?
> > > >
> > > > Can you say how many non-zero elements?
> > >
> > >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message