mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Modelling typed vectors?
Date Thu, 14 Oct 2010 14:31:57 GMT
Yes.  We have discussed the random indexing algorithms before.  These
techniques are quite old: I worked on them in the mid-to-late 90s, and there
are substantial references to these methods and the underlying mathematics
going back to the 80s.

The key reason this is potentially useful at all is the concept of
quasi-orthogonality and the enormous capacity of high-dimensional spaces:
independent random vectors in high dimensions are very nearly orthogonal to
one another.
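
To make quasi-orthogonality concrete, here is a rough, self-contained sketch
in plain Java (not Mahout code; the dimensions, seed, and trial count are
arbitrary).  It shows the typical cosine between independent Gaussian random
vectors shrinking toward zero as the dimension grows:

import java.util.Random;

// Rough illustration of quasi-orthogonality: as the dimension grows, the
// typical cosine between two independent Gaussian random vectors shrinks
// toward zero, so a high-dimensional space can hold a huge number of
// nearly orthogonal directions.
public class QuasiOrthogonality {
  public static void main(String[] args) {
    Random rng = new Random(42);
    for (int dim : new int[] {10, 100, 1000, 10000}) {
      int trials = 20;
      double sum = 0;
      for (int t = 0; t < trials; t++) {
        sum += Math.abs(cosine(gaussian(dim, rng), gaussian(dim, rng)));
      }
      // typical |cosine| is on the order of 1/sqrt(dim)
      System.out.printf("dim = %5d   mean |cosine| = %.4f%n", dim, sum / trials);
    }
  }

  static double[] gaussian(int dim, Random rng) {
    double[] v = new double[dim];
    for (int i = 0; i < dim; i++) {
      v[i] = rng.nextGaussian();
    }
    return v;
  }

  static double cosine(double[] a, double[] b) {
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / Math.sqrt(na * nb);
  }
}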

These algorithms essentially involve something like power iteration on the
connection matrix; they can also be viewed as stochastic gradient descent on
a latent factor model.
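
Roughly, the power-iteration view looks like the toy sketch below (plain Java
with a made-up four-node connection matrix; this is an illustration of the
idea, not the Mahout implementation).  Repeatedly multiplying a random start
vector by the connection matrix and renormalizing drives the vector toward
the dominant eigenvector:

import java.util.Arrays;
import java.util.Random;

// Toy power iteration on a tiny, made-up connection matrix.  The iterate v
// is repeatedly multiplied by the matrix and renormalized, converging to
// (a multiple of) the dominant eigenvector.
public class PowerIterationSketch {
  public static void main(String[] args) {
    double[][] a = {
        {0, 1, 1, 0},
        {1, 0, 1, 1},
        {1, 1, 0, 1},
        {0, 1, 1, 0}
    };
    Random rng = new Random(1);
    double[] v = new double[a.length];
    for (int i = 0; i < v.length; i++) {
      v[i] = rng.nextGaussian();
    }
    for (int iter = 0; iter < 50; iter++) {
      double[] next = new double[v.length];
      for (int i = 0; i < a.length; i++) {
        for (int j = 0; j < a[i].length; j++) {
          next[i] += a[i][j] * v[j];
        }
      }
      double norm = 0;
      for (double x : next) {
        norm += x * x;
      }
      norm = Math.sqrt(norm);
      for (int i = 0; i < next.length; i++) {
        next[i] /= norm;
      }
      v = next;
    }
    System.out.println(Arrays.toString(v));   // estimate of the dominant eigenvector
  }
}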

The idea itself grew up in parallel with the decomposition-based latent
factor ideas best typified by the LSI (latent semantic indexing) work and
carried forward in the form of LDA (latent Dirichlet allocation), MDCA
(multiple discrete component analysis), NNMF (non-negative matrix
factorization), and so on.

It is indeed common in the random indexing community to start with uniformly
distributed values.  I view the imposition of this constraint as an error
and have had much better results starting with normally distributed values.
My argument is that quasi-orthogonality does not hold with uniform positive
starting points.
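
A quick numerical check makes the point.  In this plain-Java sketch
(arbitrary dimension and seed, not Mahout code), coordinates drawn uniformly
from [0, 1] give a cosine near 0.75 between two supposedly unrelated vectors,
while standard normal coordinates give a cosine near zero:

import java.util.Random;

// Compares the cosine between two independent random vectors when the
// coordinates are drawn uniformly from [0, 1] versus from a standard
// normal.  Uniform positive vectors all crowd into the positive orthant,
// so their cosine concentrates near 0.75 and quasi-orthogonality is lost.
public class UniformVsGaussian {
  public static void main(String[] args) {
    int dim = 10000;
    Random rng = new Random(7);
    double[] u1 = new double[dim], u2 = new double[dim];
    double[] g1 = new double[dim], g2 = new double[dim];
    for (int i = 0; i < dim; i++) {
      u1[i] = rng.nextDouble();
      u2[i] = rng.nextDouble();
      g1[i] = rng.nextGaussian();
      g2[i] = rng.nextGaussian();
    }
    System.out.printf("uniform [0,1] cosine   = %.3f%n", cosine(u1, u2));  // roughly 0.75
    System.out.printf("standard normal cosine = %.3f%n", cosine(g1, g2));  // roughly 0.00
  }

  static double cosine(double[] a, double[] b) {
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / Math.sqrt(na * nb);
  }
}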

In terms of an algorithm for decomposition- or latent-factor-based learning,
I think that the Latent Factor Log-Linear models as espoused by Menon and
Elkan have more promise, especially when combined with the mixed ranking and
regression updates described by Sculley.

Within Mahout, implementing a distributed learning framework for random
indexing is relatively easy.  You just need to do a distributed matrix
multiplication.
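
A sketch of the distributed step in plain Java (illustrative names, not
Mahout APIs): each worker rebuilds the same random projection rows from a
shared seed and multiplies only its own shard of input rows, so nothing but
the seed and the shard ever has to be shipped.  This is the same seed trick
Lance mentions below:

import java.util.Arrays;
import java.util.Random;

// Sketch of "ship the seed, not the matrix" for a distributed random
// projection.  Every worker regenerates the k x d Gaussian projection
// deterministically from (seed, output dimension) and projects its own
// shard of rows; the reduced rows can then simply be collected.
public class SeededProjection {

  // Rebuild one projection row deterministically from the shared seed.
  static double[] projectionRow(long seed, int j, int d) {
    Random rng = new Random(seed + 31L * j);
    double[] row = new double[d];
    for (int i = 0; i < d; i++) {
      row[i] = rng.nextGaussian();
    }
    return row;
  }

  // One worker's job: project its shard of rows from d dimensions down to k.
  static double[][] projectShard(double[][] shard, long seed, int k) {
    int d = shard[0].length;
    double[][] out = new double[shard.length][k];
    for (int j = 0; j < k; j++) {
      double[] r = projectionRow(seed, j, d);
      for (int row = 0; row < shard.length; row++) {
        double s = 0;
        for (int i = 0; i < d; i++) {
          s += shard[row][i] * r[i];
        }
        out[row][j] = s;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    double[][] shard = {{1, 0, 2, 0, 3}, {0, 1, 0, 1, 0}};   // toy local rows
    for (double[] row : projectShard(shard, 20101014L, 3)) {
      System.out.println(Arrays.toString(row));
    }
  }
}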

Refs:

http://www.deepdyve.com/search?query=combining+ranking+and+regression (Sculley)
http://arxiv.org/abs/1006.2156 (Menon and Elkan)


On Wed, Oct 13, 2010 at 9:56 PM, Lance Norskog <goksron@gmail.com> wrote:

> Thank you. Passing the random seed rather than the projected matrix is
> a cool trick. I'm sure I would have figured it out before the heat
> death of the universe.
>
> Looking at it again, I can split out some of the processing into
> parallel chains and use the Vector classes. This would allow the
> output of this to play with other Mahout tools.
>
> The use case is something called 'semantic vectors' to store
> recommendations. The idea is to have two sets of random vectors, one
> for users and one for items. All users are projected to random
> positions between 0 and 1. All items are also projected to random
> positions between 0 and 1. Now, the users pull items toward themselves
> or push them away, using the preference values. This creates an item
> vector which is slightly perturbed according to the user/item
> preferences. Drawing this displays the concept well; it is intuitively
> simple, which is why I can understand it.
>
> Do this paired projection/perturbation 100 to 200 times and combine
> all of the user vectors and item vectors into parallel N-dimensional
> spaces. The rearrangements collectively reflect the preference values:
> the relative preferences in a user's pref list should roughly match
> the distances between the user's random vector and each item's
> perturbed vector. Now, the web of perturbations (every user having
> pushed or pulled at least one item) gives each user a good distance to
> all items.
>
> Now, to recommend items for a user, do a nearest-neighbor check for
> the user against all items.
>
> This idea came from the "Semantic Vectors" project on Google code:
> http://code.google.com/p/semanticvectors/
> They use this algorithm for document/term collocation: projected
> documents perturb projected terms.
>
> Enough for this braindump.
>
> Lance
>
> On Wed, Oct 13, 2010 at 7:22 AM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
> > See
> >
> http://tdunning.blogspot.com/2010/10/why-is-sum-of-two-uniform-randoms-not.html
> >
> > On Tue, Oct 12, 2010 at 11:07 PM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
> >
> >> I will put up a more detailed explanation on my blog where I can draw
> >> pretty pictures and write mathematical notation, but the crux of the
> >> argument is that if you are adding two random variables x and y, then
> >> the region where there is non-zero probability is the square
> >> [0,1] x [0,1].
> >>
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
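
A minimal plain-Java sketch of the paired projection/perturbation described
above (all names, constants, and toy preference values are illustrative and
are not taken from the Semantic Vectors project).  Users and items start at
random positions, every observed preference pulls the item toward or pushes
it away from the user, and after many passes a nearest-neighbor check over
the item positions yields recommendations:

import java.util.Random;

// Toy version of the push/pull perturbation described above.  Users and
// items start at random positions in [0, 1]^dim; each preference above 0.5
// pulls the item toward the user and each preference below 0.5 pushes it
// away.  After many passes, smaller user-item distance means stronger
// affinity, so recommendation is a nearest-neighbor lookup.
public class PushPullSketch {
  public static void main(String[] args) {
    int dim = 20;
    double[][] prefs = {            // prefs[user][item] in [0, 1]; toy data
        {0.9, 0.1, 0.8, 0.0},
        {0.2, 0.9, 0.1, 0.7},
        {0.8, 0.2, 0.9, 0.1}
    };
    Random rng = new Random(13);
    double[][] users = randomPoints(prefs.length, dim, rng);
    double[][] items = randomPoints(prefs[0].length, dim, rng);

    double rate = 0.05;
    for (int pass = 0; pass < 150; pass++) {
      for (int u = 0; u < users.length; u++) {
        for (int i = 0; i < items.length; i++) {
          double pull = prefs[u][i] - 0.5;   // positive pulls, negative pushes
          for (int k = 0; k < dim; k++) {
            items[i][k] += rate * pull * (users[u][k] - items[i][k]);
          }
        }
      }
    }

    // recommend for user 0: smaller distance means stronger affinity
    for (int i = 0; i < items.length; i++) {
      System.out.printf("user 0 -> item %d distance = %.3f%n", i, distance(users[0], items[i]));
    }
  }

  static double[][] randomPoints(int n, int dim, Random rng) {
    double[][] p = new double[n][dim];
    for (int r = 0; r < n; r++) {
      for (int k = 0; k < dim; k++) {
        p[r][k] = rng.nextDouble();     // positions between 0 and 1, as described
      }
    }
    return p;
  }

  static double distance(double[] a, double[] b) {
    double s = 0;
    for (int k = 0; k < a.length; k++) {
      s += (a[k] - b[k]) * (a[k] - b[k]);
    }
    return Math.sqrt(s);
  }
}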
