mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Dimensional Reduction via Random Projection: investigations
Date Fri, 01 Jul 2011 23:22:46 GMT
Thanks!

Is this true? - "Preserving pairwise distances" means preserving the
relative distances, so the ratios of new distance:old distance should
be similar across all pairs. The standard deviation of those ratios
gives a rough-and-ready measure of the fidelity of the reduction. I
would expect the standard deviation to be highest for simple RP, lower
for this RP + orthogonalization, and lowest for MDS.
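
A minimal sketch of that measure, assuming NumPy and Euclidean
distances between rows. The name distance_ratio_stats is made up for
illustration; it also returns the mean ratio, since a uniform rescaling
changes the mean but not the relative spread.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)            # clip tiny negatives from rounding
    iu = np.triu_indices(X.shape[0], k=1)  # each pair once, no diagonal
    return np.sqrt(d2[iu])

def distance_ratio_stats(A, B):
    """Mean and standard deviation of new distance : old distance.

    A holds the original points as rows, B the reduced points as rows,
    in the same order. Pairs with zero original distance are skipped.
    """
    old = pairwise_distances(A)
    new = pairwise_distances(B)
    keep = old > 0
    ratios = new[keep] / old[keep]
    return ratios.mean(), ratios.std()
```

A standard deviation near zero means every pairwise distance was
scaled by about the same factor; dividing it by the mean makes runs
with different overall scaling comparable.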

On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Lance,
>
> You would get better results from the random projection if you did the first
> part of the stochastic SVD.  Basically, you do the random projection:
>
>       Y = A \Omega
>
> where A is your original data, \Omega is the random matrix and Y is the
> result.  Y will be tall and skinny.
>
> Then, find an orthogonal basis Q of Y:
>
>       Q R = Y
>
> This orthogonal basis will be very close to the orthogonal basis of A.  In
> fact, there are strong probabilistic guarantees on how good Q is as a basis
> of A.  Next, you project A using the transpose of Q:
>
>       B = Q' A
>
> This gives you a short fat matrix that is the projection of A into a lower
> dimensional space.  Since this projects A from the left (B = Q' A), it isn't
> quite what you want in your work, but it is the standard way to phrase
> things.  The exact same thing can be done with the random matrix applied on
> the left:
>
>      Y = \Omega A
>      L Q = Y
>      B = A Q'
>
> In this form, B is tall and skinny as you would like and Q' is essentially
> an orthogonal reformulation of the random projection.  This projection is
> about as close as you are likely to get to something that exactly preserves
> distances.  As such, you should be able to use MDS on B to get the results
> you want.
>
> Additionally, if you start with the original form and do an SVD of B (which
> is fast), you will get a very good approximation of the prominent right
> singular vectors of A.  If you do that, the first few of these should be
> about as good as MDS for visualization purposes.
>
> On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <goksron@gmail.com> wrote:
>
>> I did some testing and made a lot of pretty charts:
>>
>> http://ultrawhizbang.blogspot.com/
>>
>> If you want to get quick visualizations of your clusters, this is a
>> great place to start.
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>
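
For reference, the second form quoted above (Y = \Omega A, L Q = Y,
B = A Q') can be sketched in NumPy roughly as follows. This is only an
illustration under assumptions of mine: a Gaussian \Omega, a made-up
function name, and the L Q factorization obtained from a QR of Y'. It
is not Mahout's implementation.

```python
import numpy as np

def orthogonalized_random_projection(A, k, seed=0):
    """Project the rows of A (m x n) down to k dimensions.

    Returns B (m x k) and Qt (n x k), where Qt plays the role of Q'
    and has orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    omega = rng.standard_normal((k, m))  # random matrix \Omega (assumed Gaussian)
    Y = omega @ A                        # k x n, short and fat
    # L Q = Y is equivalent to Y' = Q' L', which is a reduced QR of Y'.
    Qt, _ = np.linalg.qr(Y.T)            # Qt = Q', n x k
    B = A @ Qt                           # m x k, tall and skinny
    return B, Qt
```

When A has rank at most k, the rows of Q span the row space of A (with
probability 1 for a Gaussian \Omega), so A is recovered as B Q and the
pairwise distances between rows are unchanged. For higher rank the
distances are only approximately preserved.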



-- 
Lance Norskog
goksron@gmail.com
