mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: eigendecomposition of very large matrices
Date Sun, 29 Jul 2012 05:22:22 GMT
The algorithm used doesn't change this.  If U S V' = A is the SVD of A, then

   A' A = (U S V')' U S V'
          = V S U' U S V'
          = V S^2 V'



On Thu, Jul 26, 2012 at 4:31 PM, John Stewart <jds@wgen.net> wrote:

> With Lanczos, the eigenvectors of A'A give you the orthogonal matrix V of
> SVD, and the eigenvalues give you the singular values (usually named as
> Sigma or S).  To get U you then back multiply: U = AV(S^-1)
>
> jds
>
> On 7/26/12 5:32 PM, "Dmitriy Lyubimov" <dlieu.7@gmail.com> wrote:
>
> >wikipedia article implies that if A is positive definite (and normal
> >which is true for symmetric) then SVD is the same as
> >eigendecomposition.
> >
> >however being positive definite is a fairly strong requirement...
> >
> >On Thu, Jul 26, 2012 at 2:06 PM, Aniruddha Basak <t-abasak@expedia.com>
> >wrote:
> >> If A is symmetric, I observe from small tests in Matlab, the eigenvalues
> >> and singular values are same (might not always be true) in magnitude.
> >> Similarly eigenvectors and columns of V are same but some have opposite
> >>sign.
> >>
> >> Now, there are two questions,
> >> 1. How to retrieve eigenvectors with correct sign from U or V?
> >> 2. If we ignore this difference, and perform the next steps of spectral
> >>kmeans, will the results be correct?
> >>
> >>
> >> Aniruddha
> >>
> >>
> >> -----Original Message-----
> >> From: Dan Brickley [mailto:danbri@danbri.org]
> >> Sent: Thursday, July 26, 2012 1:56 PM
> >> To: user@mahout.apache.org
> >> Cc: user@mahout.apache.org
> >> Subject: Re: eigendecomposition of very large matrices
> >>
> >>
> >>
> >>
> >>
> >> On 26 Jul 2012, at 21:21, Aniruddha Basak <t-abasak@expedia.com> wrote:
> >>
> >>> Actually that's my confusion. I don't need the eigenvectors of AA'
> >>> but of A !
> >>> If I can find a matrix B such that BB'=A, then from the SVD
> >>> decomposition of B we can get the eigenvectors of A. But how to get B
> >>>in that case ?
> >>
> >> Does it help us if A is symmetric?
> >>
> >> Dan
> >>
> >>>
> >>> Aniruddha
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Dmitriy Lyubimov [mailto:dlieu.7@gmail.com]
> >>> Sent: Thursday, July 26, 2012 1:18 PM
> >>> To: user@mahout.apache.org
> >>> Subject: Re: eigendecomposition of very large matrices
> >>>
> >>> See http://en.wikipedia.org/wiki/Singular_value_decomposition,
> >>> "relation to eigenvalue decomposition".
> >>>
> >>> Depending on what you consider source for the eigendecompostion, AA'
> >>> or A'A, the eigenvectors would be column vectors of U or V
> >>>respectively.
> >>>
> >>> On Thu, Jul 26, 2012 at 1:12 PM, Aniruddha Basak
> >>><t-abasak@expedia.com> wrote:
> >>>> Hi,
> >>>> I am trying to use SSVD instead of Lanczos, as a part of Spectral
> >>>>Kmeans.
> >>>> However, I could not find the relation between the eigenvectors and
> >>>>U, V matrices.
> >>>> Can someone please tell me, how to retrieve the eigenvectors from
> >>>>SSVD decomposition ?
> >>>>
> >>>> Thanks,
> >>>> Aniruddha
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Dmitriy Lyubimov [mailto:dlieu.7@gmail.com]
> >>>> Sent: Thursday, July 19, 2012 10:53 PM
> >>>> To: user@mahout.apache.org
> >>>> Subject: RE: eigendecomposition of very large matrices
> >>>>
> >>>> Pps if you do insist on having a lot of k then you'll benefit from
> >>>>smaller hdfs block size, not larger.
> >>>> On Jul 19, 2012 10:50 PM, "Dmitriy Lyubimov" <dlieu.7@gmail.com>
> >>>>wrote:
> >>>>
> >>>>> Yeah I see OK. Both two experiments conducted with mahout ssvd I
am
> >>>>> familiar with dealt with input size greater than yours element wise,
> >>>>> on a quite modest node count. So i don't think your input size will
> >>>>> be a problem. But the number of singular values will be.
> >>>>>
> >>>>> But I doubt any input will yield anything useful beyond k=200 but
> >>>>> statistical noise. Even if you have a good decay of the singular
> >>>>>values.
> >>>>> But I bet you don't need that many. You can fit significantly more
> >>>>> 'clusters' on a 'fairly small' dimensional space.
> >>>>> On Jul 19, 2012 6:33 PM, "Aniruddha Basak" <t-abasak@expedia.com>
> >>>>>wrote:
> >>>>>
> >>>>>> Thanks Dmitriy for your reply.
> >>>>>> The matrix I am working on, has 10-20 non zero entries per row.
So
> >>>>>> its very sparse.
> >>>>>> I am trying to do spectral clustering which involves
> >>>>>>eigen-decomposition.
> >>>>>> I am wondering whether anyone has tried to do spectral clustering
> >>>>>> using mahout for very large affinity matrix (input).
> >>>>>>
> >>>>>> Aniruddha
> >>>>>>
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Dmitriy Lyubimov [mailto:dlieu.7@gmail.com]
> >>>>>> Sent: Thursday, July 19, 2012 6:28 PM
> >>>>>> To: user@mahout.apache.org
> >>>>>> Subject: Re: eigendecomposition of very large matrices
> >>>>>>
> >>>>>> very significant sparsity may be a problem though for -q >=1
> >>>>>>parameters.
> >>>>>> Again, depends on the hardware you have and the # of non-zero
> >>>>>> elements in the input. but -q=1 is still the most recommended
> >>>>>>setting here.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jul 19, 2012 at 6:20 PM, Dmitriy Lyubimov
> >>>>>> <dlieu.7@gmail.com>
> >>>>>> wrote:
> >>>>>>> you may try SSVD.
> >>>>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Sing
> >>>>>>> u
> >>>>>>> lar
> >>>>>>> +Value+Decomposition
> >>>>>>>
> >>>>>>> but 4k eigenvectors (or, rather, singular values) is kind
of still
> >>>>>>> a lot though and may push the precision out of the error
estimates.
> >>>>>>> I don't we had precision study for that many. Also need
quite a
> >>>>>>> bit of memory to compute that (not to mention flops). More
> >>>>>>> realistically you probably may try 1k singular values .
You may
> >>>>>>> try more if you have access to more powerful hardware than
we did
> >>>>>>> in the studies but distributed computation time will grow
at about
> >>>>>>> k^1.5, i.e. faster than linear, even if you have enough
nodes for
> >>>>>>>the tasks.
> >>>>>>>
> >>>>>>> -d
> >>>>>>>
> >>>>>>> On Thu, Jul 19, 2012 at 6:12 PM, Aniruddha Basak
> >>>>>>> <t-abasak@expedia.com>
> >>>>>> wrote:
> >>>>>>>> Hi,
> >>>>>>>> I am working on a clustering problem which involves
determining
> >>>>>>>> the largest "k" eigenvectors of a very large matrix.
The
> >>>>>>>> matrices, I work on, are typically of the order of 10^6
by 10^6.
> >>>>>>>> Trying to do this using the Lanczos solver available
in Mahout, I
> >>>>>>>> found it is very slow and takes around 1.5 minutes to
compute
> >>>>>>>> each
> >>>>>> eigenvectors.
> >>>>>>>> Hence to get 4000 eigenvectors, it takes 100 hours or
4 days !!
> >>>>>>>>
> >>>>>>>> So I am looking for something faster to solve the "Eigen
> >>>>>>>>decomposition"
> >>>>>>>> problem for very large sparse matrix. Please suggest
me what
> >>>>>>>> should I
> >>>>>> use ?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Aniruddha
> >>>>>>>>
> >>>>>>
> >>>>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message