mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Dimensional Reduction via Random Projection: investigations
Date Fri, 01 Jul 2011 23:54:27 GMT
Yes.  Been there.  Done that.

The correlation is stunningly good.
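
For anyone who wants to reproduce the check described in the quoted question below, here is a minimal sketch in Python with NumPy. The language, the synthetic low-rank data, the scaled Gaussian matrix used for "simple RP", and the helper names `pairwise_distances`, `simple_rp` and `orthogonalized_rp` are all assumptions made for illustration; none of them come from the thread. It computes the ratio of new distance to old distance for every pair of rows and compares the spread of those ratios for the plain projection and for the orthogonalized left projection (Y = \Omega A, L Q = Y, B = A Q').

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test data: 200 points in 50 dimensions that lie close to a
# 5-dimensional subspace (low rank plus a little noise).
m, n, r, k = 200, 50, 5, 10
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A += 0.01 * rng.standard_normal((m, n))


def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(X.shape[0], k=1)]


def simple_rp(A, k, rng):
    """Plain random projection with a scaled Gaussian matrix (an assumption)."""
    G = rng.standard_normal((A.shape[1], k)) / np.sqrt(k)
    return A @ G


def orthogonalized_rp(A, k, rng):
    """Left random projection, then orthogonalization: Y = Omega A, L Q = Y, B = A Q'."""
    Omega = rng.standard_normal((k, A.shape[0]))
    Y = Omega @ A                  # k x n, short and fat
    Qt, _ = np.linalg.qr(Y.T)      # Y.T = Qt R, so Y = R.T Qt.T, i.e. L = R.T and Q = Qt.T
    return A @ Qt                  # m x k, tall and skinny


d_old = pairwise_distances(A)
ratios_rp = pairwise_distances(simple_rp(A, k, rng)) / d_old
ratios_orth = pairwise_distances(orthogonalized_rp(A, k, rng)) / d_old

print("simple RP          std of distance ratios:", ratios_rp.std())
print("RP + orthogonalize std of distance ratios:", ratios_orth.std())
```

Because the columns of Q' are orthonormal, the orthogonalized projection can only shrink a distance, never stretch it, so its ratios stay at or below 1. How tight the spread is on real data depends on how quickly the singular values of A decay, so treat this as a way to measure fidelity rather than a guarantee about it.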

On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <goksron@gmail.com> wrote:

> Thanks!
>
> Is this true? - "Preserving pairwise distances" means the relative
> distances. So the ratios of new distance:old distance should be
> similar. The standard deviation of the ratios gives a rough-and-ready
> measure of the fidelity of the reduction. The standard deviation of
> simple RP should be highest, then this RP + orthogonalization, then
> MDS.
>
> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > Lance,
> >
> > You would get better results from the random projection if you did the
> > first part of the stochastic SVD.  Basically, you do the random
> > projection:
> >
> >       Y = A \Omega
> >
> > where A is your original data, \Omega is the random matrix and Y is the
> > result.  Y will be tall and skinny.
> >
> > Then, find an orthogonal basis Q of Y:
> >
> >       Q R = Y
> >
> > This orthogonal basis will be very close to an orthogonal basis for the
> > column space of A.  In fact, there are strong probabilistic guarantees
> > on how good Q is as a basis of A.  Next, you project A using the
> > transpose of Q:
> >
> >       B = Q' A
> >
> > This gives you a short fat matrix that is the projection of A into a
> > lower dimensional space.  Since this is a left projection, it isn't
> > quite what you want in your work, but it is the standard way to phrase
> > things.  The exact same thing can be done with left random projection:
> >
> >      Y = \Omega A
> >      L Q = Y
> >      B = A Q'
> >
> > In this form, B is tall and skinny as you would like and Q' is
> > essentially an orthogonal reformulation of the random projection.  This
> > projection is about as close as you are likely to get to something that
> > exactly preserves distances.  As such, you should be able to use MDS on
> > B to get the results you want.
> >
> > Additionally, if you start with the original form and do an SVD of B
> > (which is fast), you will get a very good approximation of the prominent
> > right singular vectors of A.  If you do that, the first few of these
> > should be about as good as MDS for visualization purposes.
> >
> > On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <goksron@gmail.com> wrote:
> >
> >> I did some testing and made a lot of pretty charts:
> >>
> >> http://ultrawhizbang.blogspot.com/
> >>
> >> If you want to get quick visualizations of your clusters, this is a
> >> great place to start.
> >>
> >> --
> >> Lance Norskog
> >> goksron@gmail.com
> >>
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
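
The quoted remark about taking an SVD of B in the original form (Y = A \Omega, Q R = Y, B = Q' A) can be sketched as follows, again in Python with NumPy. This is only an illustration under stated assumptions: the matrix sizes, the chosen singular values and the variable names are made up, and the test matrix has exact rank 5 with k = 10, which is the easy case where the recovery is essentially exact. On real data the result is an approximation whose quality depends on how fast the singular values decay.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a 300 x 40 matrix with known singular values 10, 8, 6, 4, 2.
m, n, k = 300, 40, 10
true_s = np.array([10.0, 8.0, 6.0, 4.0, 2.0])
U, _ = np.linalg.qr(rng.standard_normal((m, true_s.size)))  # orthonormal left vectors
V, _ = np.linalg.qr(rng.standard_normal((n, true_s.size)))  # orthonormal right vectors
A = (U * true_s) @ V.T

# First part of the stochastic SVD, as described in the quoted message.
Omega = rng.standard_normal((n, k))
Y = A @ Omega               # m x k, tall and skinny
Q, _ = np.linalg.qr(Y)      # m x k, orthonormal columns
B = Q.T @ A                 # k x n, short and fat

# SVD of the small matrix B; the rows of Vt approximate right singular vectors of A.
_, s, Vt = np.linalg.svd(B, full_matrices=False)

# Alignment between recovered and true right singular vectors, ignoring sign.
overlap = np.abs(Vt[: true_s.size] @ V)

print("recovered singular values:", s[: true_s.size])
print("diagonal of overlap:      ", np.diag(overlap))
```

The SVD here runs on a k x n matrix instead of the full m x n one, which is why it is cheap. For a 2-D picture one would plot A projected onto the first two rows of `Vt`.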
