mahout-user mailing list archives

From Lance Norskog <>
Subject Re: Dimensional Reduction via Random Projection: investigations
Date Fri, 01 Jul 2011 23:22:46 GMT

Is this true? - "Preserving pairwise distances" means the relative
distances. So the ratios of new distance:old distance should be
similar. The standard deviation of the ratios gives a rough-and-ready
measure of the fidelity of the reduction. The standard deviation of
simple RP should be highest, then this RP + orthogonalization, then
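
A minimal sketch of the ratio measure described above, assuming NumPy and
Euclidean distance between rows. The function names, the test data and the
1/sqrt(k) scaling of the random matrix are illustrative assumptions, not
anything from Mahout. The mean ratio is returned as well, since dividing
the standard deviation by it removes any overall rescaling introduced by
the projection.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(X.shape[0], k=1)   # each pair counted once
    return dist[i, j]

def distance_ratio_std(A, B):
    """Std and mean of (new distance / old distance) over all row pairs."""
    old = pairwise_distances(A)
    new = pairwise_distances(B)
    keep = old > 0                            # skip duplicate rows (division by zero)
    ratios = new[keep] / old[keep]
    return ratios.std(), ratios.mean()

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))            # hypothetical data: 100 points, 50 dims
k = 10
Omega = rng.standard_normal((50, k)) / np.sqrt(k)   # simple Gaussian random projection
std_rp, mean_rp = distance_ratio_std(A, A @ Omega)
print("std of ratios:", std_rp, "mean ratio:", mean_rp)
```

A projection that preserved every pairwise distance up to one common scale
factor would give a standard deviation of zero, so smaller is better.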

On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <> wrote:
> Lance,
> You would get better results from the random projection if you did the first
> part of the stochastic SVD.  Basically, you do the random projection:
>       Y = A \Omega
> where A is your original data, \Omega is the random matrix and Y is the result.
>  Y will be tall and skinny.
> Then, find an orthogonal basis Q of Y:
>       Q R = Y
> This orthogonal basis will be very close to the orthogonal basis of A.  In
> fact, there are strong probabilistic guarantees on how good Q is as a basis
> of A.  Next, you project A using the transpose of Q:
>       B = Q' A
> This gives you a short fat matrix that is the projection of A into a lower
> dimensional space.  Since this is a left projection, it isn't quite what you
> want in your work, but it is the standard way to phrase things.  The exact
> same thing can be done with left random projection:
>      Y = \Omega A
>      L Q = Y
>      B = A Q'
> In this form, B is tall and skinny as you would like and Q' is essentially
> an orthogonal reformulation of the random projection.  This projection is
> about as close as you are likely to get to something that exactly preserves
> distances.  As such, you should be able to use MDS on B to get exactly the
> results you want.
> Additionally, if you start with the original form and do an SVD of B (which
> is fast), you will get a very good approximation of the prominent right
> singular vectors of A.  If you do that, the first few of these should be
> about as good as MDS for visualization purposes.
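
The two recipes quoted above can be sketched in NumPy as follows. This is
an illustration under stated assumptions, not Mahout's stochastic SVD code:
the function names and test data are hypothetical, \Omega is taken to be
Gaussian, and since NumPy has no LQ routine the step L Q = Y is obtained
from a QR decomposition of Y'.

```python
import numpy as np

def orthogonalized_projection(A, k, rng):
    """Y = Omega A, L Q = Y, B = A Q'  (rows of A are the data points)."""
    n, d = A.shape
    Omega = rng.standard_normal((k, n))   # random matrix, k x n
    Y = Omega @ A                         # k x d, short and fat
    # QR of Y' gives Y' = Q1 R1, hence Y = R1' Q1', i.e. L = R1' and Q = Q1'
    Q1, _ = np.linalg.qr(Y.T)             # d x k, orthonormal columns
    return A @ Q1                         # B = A Q', n x k, tall and skinny

def approx_right_singular_vectors(A, k, rng):
    """Y = A Omega, Q R = Y, B = Q' A, followed by an SVD of the small B."""
    n, d = A.shape
    Omega = rng.standard_normal((d, k))   # random matrix, d x k
    Y = A @ Omega                         # n x k, tall and skinny
    Q, _ = np.linalg.qr(Y)                # n x k orthonormal basis of Y
    B = Q.T @ A                           # k x d, short and fat
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return s, Vt                          # rows of Vt approximate right singular vectors of A

rng = np.random.default_rng(0)
# hypothetical data: 300 points in 50 dimensions, exact rank 5
A = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 50))
B = orthogonalized_projection(A, k=10, rng=rng)
s, Vt = approx_right_singular_vectors(A, k=10, rng=rng)
```

With exactly low-rank data like this (rank no larger than k) the
orthogonalized projection keeps pairwise distances unchanged up to rounding
error. On full-rank data it is only approximate, and k controls the quality.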
> On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <> wrote:
>> I did some testing and made a lot of pretty charts:
>> If you want to get quick visualizations of your clusters, this is a
>> great place to start.
>> --
>> Lance Norskog

Lance Norskog
