mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Dimensional Reduction via Random Projection: investigations
Date Fri, 01 Jul 2011 18:03:01 GMT
Lance,

You would get better results from the random projection if you did the first
part of the stochastic SVD.  Basically, you do the random projection:

       Y = A \Omega

where A is your original data, \Omega is the random matrix and Y is the
result.  Y will be tall and skinny.

Then, find an orthogonal basis Q of Y:

       Q R = Y

This orthogonal basis will be very close to an orthogonal basis for the
column space of A.  In
fact, there are strong probabilistic guarantees on how good Q is as a basis
of A.  Next, you project A using the transpose of Q:

       B = Q' A

This gives you a short fat matrix that is the projection of A into a lower
dimensional space.  Since Q' multiplies A on the left, this reduces the
number of rows rather than the number of columns, so it isn't quite what you
want in your work, but it is the standard way to phrase things.  The exact
same thing can be done with the random matrix applied on the left:

      Y = \Omega A
      L Q = Y
      B = A Q'

In this form, B is tall and skinny as you would like and Q' is essentially
an orthogonal reformulation of the random projection.  This projection is
about as close as you are likely to get to something that exactly preserves
distances.  As such, you should be able to use MDS on B to get the
results you want.
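
The second form can be sketched in a few lines.  This is only an
illustration in Python with NumPy (an assumption on my part, not Mahout
code); the function names, the Gaussian choice for \Omega and the synthetic
data are all hypothetical.  NumPy has no LQ routine, so the sketch takes the
QR decomposition of Y' instead, which gives the same Q.  It assumes k is no
larger than either dimension of A.

```python
import numpy as np


def orthogonalized_projection(A, k, seed=0):
    """Project the rows of A (m x n) down to k dimensions as B = A Q'."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((k, m))  # random matrix, k x m (assumed Gaussian)
    Y = Omega @ A                        # short and fat, k x n
    # LQ of Y via QR of Y': Y' = Qt R implies Y = R' Qt', so L = R', Q = Qt'.
    Qt, _ = np.linalg.qr(Y.T)            # Qt is n x k with orthonormal columns
    B = A @ Qt                           # tall and skinny, m x k
    return B, Qt


def pairwise_distances(X):
    """Euclidean distances between all pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Usage on made-up data: 100 points lying close to a 5-dimensional subspace.
rng = np.random.default_rng(42)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 40))
A += 0.01 * rng.standard_normal(A.shape)
B, Qt = orthogonalized_projection(A, k=10)
err = np.abs(pairwise_distances(B) - pairwise_distances(A)).max()
print("largest change in any pairwise distance:", err)
```

Because the columns of Qt are orthonormal, this projection can only shrink
distances between rows, never stretch them.  How much it shrinks them
depends on how much of A lies outside the k-dimensional subspace that Q
captures.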

Additionally, if you start with the original form and do an SVD of B (which
is fast), you will get a very good approximation of the prominent right
singular vectors of A.  If you do that, the first few of these should be
about as good as MDS for visualization purposes.
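
A rough sketch of that route, again in Python with NumPy as an illustration
only (the function name, the Gaussian \Omega and the test matrix are
hypothetical, and k is assumed to be no larger than either dimension of A):
form Y = A \Omega, take Q from its QR decomposition, compute B = Q' A, and
read the approximate right singular vectors of A off the SVD of the small
matrix B.

```python
import numpy as np


def approx_right_singular_vectors(A, k, seed=0):
    """Approximate the top right singular vectors of A (m x n) from B = Q' A."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k))  # random matrix, n x k (assumed Gaussian)
    Y = A @ Omega                        # tall and skinny, m x k
    Q, _ = np.linalg.qr(Y)               # orthonormal basis of Y, m x k
    B = Q.T @ A                          # short and fat, k x n
    # SVD of the small matrix B; the rows of Vt approximate the leading
    # right singular vectors of A.
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return s, Vt


# Usage on made-up data: 2-D coordinates for plotting each row of A.
rng = np.random.default_rng(7)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
s, Vt = approx_right_singular_vectors(A, k=10)
coords = A @ Vt[:2].T                    # 200 x 2, one point per row of A
print(coords.shape)
```

Only the first two rows of Vt are used for the plot coordinates here.  The
later ones are progressively less reliable, which is why only the first few
are worth looking at.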

On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <goksron@gmail.com> wrote:

> I did some testing and make a lot of pretty charts:
>
> http://ultrawhizbang.blogspot.com/
>
> If you want to get quick visualizations of your clusters, this is a
> great place to start.
>
> --
> Lance Norskog
> goksron@gmail.com
>
