mahout-user mailing list archives

From Jeff Hansen <dsche...@gmail.com>
Subject Re: Singular vectors of a recommendation Item-Item space
Date Thu, 25 Aug 2011 21:21:56 GMT
Well, I think my problem may have had more to do with what I was calling the
eigenvector...  I was referring to the rows rather than the columns of U and
V.  While the columns may be characteristic of the overall matrix, the rows
are characteristic of the user or item (in that they are a rank reduced
representation of that person or thing). I guess you could say I just had to
tilt my head to the side and change my perspective 90 degrees =)
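
A small R sketch of that row-wise reading, using made-up data (the 20 x 50 matrix, the seed, and k = 5 are assumptions for illustration, not the 200 x 10,000 matrix from the thread). It shows that one cell of the rank-reduced approximation depends only on one row of U (the movie) and one row of V (the user):

```r
set.seed(1)
# Hypothetical stand-in data: 20 movies x 50 users of 0/1 "liked" flags.
M <- matrix(rbinom(20 * 50, 1, 0.3), nrow = 20, ncol = 50)
k <- 5
s <- svd(M, nu = k, nv = k)
U <- s$u               # 20 x k: row i is the reduced representation of movie i
S <- diag(s$d[1:k])    # k x k: leading singular values on the diagonal
V <- s$v               # 50 x k: row j is the reduced representation of user j

M_hat <- U %*% S %*% t(V)   # rank-k approximation of M

# The (i, j) entry is a weighted dot product of movie row i and user row j.
i <- 3; j <- 7
score_ij <- sum(U[i, ] * s$d[1:k] * V[j, ])
stopifnot(isTRUE(all.equal(score_ij, M_hat[i, j])))
```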

On Thu, Aug 25, 2011 at 3:57 PM, Jake Mannix <jake.mannix@gmail.com> wrote:

> On Thu, Aug 25, 2011 at 1:53 PM, Jeff Hansen <dscheffy@gmail.com> wrote:
>
> > By the way, please ignore my use of the term eigenvector -- I have a
> > feeling I completely misused it.  I've never quite understood the
> > concept, but to me that truncated 10-value-long vector that corresponds
> > to a movie seems to be "characteristic" of it (which is what the term
> > "eigen" was always intended to convey).
> >
>
> It actually *is* an eigenvector, you're not wrong.
>
> In fact, singular vectors *are* eigenvectors, in general.  If you're a
> singular vector of matrix A, then you're an eigenvector of either AA' or
> A'A (depending on whether you're a left or right singular vector,
> respectively).
>
>  -jake
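
A quick numerical check of that statement in R, as a sketch on an arbitrary random matrix (the 6 x 4 size and the seed are assumptions for illustration). The first left singular vector should be an eigenvector of AA', and the first right singular vector an eigenvector of A'A, both with eigenvalue equal to the squared singular value:

```r
set.seed(2)
A <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)   # arbitrary example matrix
s <- svd(A)
u1 <- s$u[, 1]        # first left singular vector
v1 <- s$v[, 1]        # first right singular vector
sigma1 <- s$d[1]      # first singular value

# (A A') u1 should equal sigma1^2 * u1
lhs_left <- as.vector((A %*% t(A)) %*% u1)
# (A' A) v1 should equal sigma1^2 * v1
lhs_right <- as.vector((t(A) %*% A) %*% v1)

stopifnot(isTRUE(all.equal(lhs_left, sigma1^2 * u1)))
stopifnot(isTRUE(all.equal(lhs_right, sigma1^2 * v1)))
```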
>
>
> >
> > On Thu, Aug 25, 2011 at 3:40 PM, Jeff Hansen <dscheffy@gmail.com> wrote:
> >
> > > I've been playing around with this problem for the last week or so (or
> > > at least this problem as I understood it based on your initial
> > > commentary, Lance) -- but purely in R using smaller data so I can 1. get
> > > my head wrapped around the problem, and 2. get more familiar with R.
> > >
> > > To make the problem a little more tractable I limited my sample to 200
> > > movies and 10,000 users (taking the most rated movies from 2004 and 2005
> > > based on NF's dataset -- I know, I should really switch back to the
> > > grouplens dataset...)  I'm also only looking at binary data at the
> > > moment -- I treat any rating above 3 as a movie you liked and anything 3
> > > or below as the same as not having rated the movie.
> > >
> > > So I take this 200 x 10,000 matrix of 1s and 0s and I run a truncated
> > > SVD on it so that I can project it onto a 10-dimensional space.
> > >
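> > > # M is the initial data: the 200 x 10,000 binary matrix described above
> > > s_m <- svd(M, nu = 10, nv = 10)  # keep 10 left and 10 right singular vectors
> > > U <- s_m$u                       # 200 x 10
> > > S <- diag(s_m$d[1:10])           # 10 x 10
> > > V <- s_m$v                       # 10,000 x 10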
> > >
> > > So U is a 200 row by 10 column matrix -- each row represents the
> > > eigenvector of a given movie, and each column represents one of Lance's
> > > so-called axes of interest.  So what I did next was spit out the top and
> > > bottom n movie titles for each of these 10 dimensions. I found it was
> > > important to show more than one movie title for each side of the
> > > dimensions, otherwise the results might be somewhat misleading.
> > >
> > > I then went through the 10 dimensions and qualitatively answered for
> > > myself whether I was strongly or weakly aligned in one direction, or not
> > > aligned in any way on this dimension. Personally I usually found I only
> > > felt strongly aligned on 2 of the ten, and weakly aligned on another 2.
> > >
> > > I then normalized U across each of the ten dimensions and, for each
> > > movie, multiplied its z-score in each dimension by my alignment in that
> > > dimension and added up the products.  I then sorted the results and
> > > displayed the movie titles -- it was a pretty accurate ranking of movies
> > > as I like them.
> > >
> > > scaled <- apply(U,2,scale)
> > > me <- c(0,2,1,0,-1,1,0,0,0,0)
> > > dim(me) <- c(10,1)
> > > recommendations <- scaled %*% me
> > >
> > > I imagine few users would want to bother, but I can see where it would
> > > be a relatively quick way to train a recommender.  Here's the problem
> > > though -- I can get it to work using the method I've described above,
> > > but I can't quite figure out how to use it to generate an eigenvector
> > > for the user.  For existing users I can always generate predictions by
> > > matrix multiplying U %*% S %*% t(V)[,user] and then sorting by the
> > > results.  It would be nice to use a consistent model.  I can't quite see
> > > the math to generate an equivalent equation though.
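
One possible answer to that question, sketched in R on made-up data (the matrix size, seed, and the helper name fold_in are assumptions for illustration; this is the projection often called "folding in", not something the thread itself settles on). Because M = U S V', a user's column m of M satisfies t(U) %*% m = S %*% v, so the user's reduced vector is v = solve(S) %*% t(U) %*% m. The same formula can be applied to a new user's like/dislike column:

```r
set.seed(3)
# Hypothetical stand-in data: 20 movies x 50 users of 0/1 "liked" flags.
M <- matrix(rbinom(20 * 50, 1, 0.3), nrow = 20, ncol = 50)
k <- 5
s <- svd(M, nu = k, nv = k)
U <- s$u; S <- diag(s$d[1:k]); V <- s$v

# Hypothetical helper: project a movies-length 0/1 column into the
# k-dimensional user space (assumes the kept singular values are nonzero).
fold_in <- function(m) as.vector(solve(S) %*% t(U) %*% m)

# Sanity check: an existing user's column recovers that user's row of V.
user <- 7
stopifnot(isTRUE(all.equal(fold_in(M[, user]), V[user, ])))

# A new user who was not part of the SVD.
m_new <- rbinom(20, 1, 0.3)
v_new <- fold_in(m_new)

# Same prediction form as for existing users: U %*% S %*% (user vector).
predictions <- U %*% S %*% v_new
ranking <- order(predictions, decreasing = TRUE)   # movie indices, best first
```

With this form, existing and new users go through the same U %*% S %*% v product; the prediction for a new user reduces to U %*% t(U) %*% m_new.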
> > >
> > > On Wed, Aug 17, 2011 at 3:52 AM, Lance Norskog <goksron@gmail.com> wrote:
> > >
> > >> Sharpened:
> > >>
> > >> http://ultrawhizbang.blogspot.com/2011/08/singular-vectors-for-recommendations.html
> > >>
> > >> On Wed, Aug 10, 2011 at 11:53 PM, Sean Owen <srowen@apache.org> wrote:
> > >> > You may need to sharpen your terms / problem statement here:
> > >> >
> > >> > What is a geometric value -- do you just mean a continuous real value?
> > >> > So these are item-feature vectors?
> > >> >
> > >> > The middle bit of the output of an SVD is not a singular vector --
> > >> > it's a diagonal matrix containing singular values on the diagonal.
> > >> > The left matrix contains singular vectors, which are not eigenvectors
> > >> > of the original matrix except in very specific cases.
> > >> >
> > >> > Singular vectors are the columns of the left matrix, not rows, whereas
> > >> > items correspond to its rows. What do you mean about relating them?
> > >> > What do you mean by the "hot spot" you are trying to find?
> > >> > A vector does not express two end-points, no. You could think of (X,Y)
> > >> > as corresponding to a point in 2-space, or could think of it as a ray
> > >> > from (0,0) to (X,Y), but you could think of it as (100,200) to
> > >> > (100+X,200+Y) just as well. There are not two points implied by
> > >> > anything here.
> > >> >
> > >> >
> > >> > How do you get points from the original item-feature space into the
> > >> > transformed, reduced space? While I think this is an imprecise answer:
> > >> > if A = U Sigma V^T then you can think of (Sigma V^T) as like the
> > >> > change-of-basis transformation that does this.
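
One way to make that change of basis precise, written out as a short derivation. It assumes A is the [# items, # dimensions] matrix from the original question, that the columns of V are orthonormal, and that the kept singular values are nonzero so that Sigma is invertible; the symbols a and \hat{a} (an item's original row and its reduced coordinates) are introduced here for illustration:

```latex
\begin{aligned}
A &= U \Sigma V^{T}, \qquad V^{T} V = I \\
A V &= U \Sigma V^{T} V = U \Sigma \\
U &= A V \Sigma^{-1} \\
\hat{a} &= a \, V \Sigma^{-1} \quad \text{for a single item row } a \text{ of } A
\end{aligned}
```

Under these assumptions, an item's row in U is obtained by multiplying its original row by V and then dividing each coordinate by the matching singular value; dropping the Sigma^{-1} step gives the same coordinates scaled by the singular values.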
> > >> >
> > >> >
> > >> > On Wed, Aug 10, 2011 at 10:54 AM, Lance Norskog <goksron@gmail.com> wrote:
> > >> >
> > >> >> Zeroing in on the topic:
> > >> >>
> > >> >> I have:
> > >> >> 1) a set of raw input vectors of a given length, one for each item.
> > >> >> Each value in the vectors is geometric, not bag-of-words or other.
> > >> >> The matrix is [# items , # dimensions].
> > >> >> 2) An SVD of same:
> > >> >>    left matrix of [ # items, #d features per item] * singular
> > >> >> vector[# features] * right matrix of [#dimensions features per
> > >> >> dimension, #dimensions].
> > >> >> 3) The first few columns of the left matrix are interesting singular
> > >> >> eigenvectors.
> > >> >>
> > >> >> I would like to:
> > >> >> 1) relate the singular vectors to the item vectors, such that they
> > >> >> create points in the "hot spots" of the item vectors.
> > >> >> 2) find the inverses: a singular vector has two endpoints, and both
> > >> >> represent "hot spots" in the item space.
> > >> >>
> > >> >> Given the first 3 singular vectors, there are 6 "hot spots" in the
> > >> >> item vectors, one for each end of the vector. What transforms are
> > >> >> needed to get the item vectors and the singular vector endpoints in
> > >> >> the same space? I'm not finding the exact sequence.
> > >> >>
> > >> >> A use case for this is a new user. It gives a quick assessment by
> > >> >> asking where the user is on the few common axes of items:
> > >> >> "Transformers 3: The Stupiding" vs. "Crazy Bride Wedding Love
> > >> >> Planner"?
> > >> >>
> > >> >> --
> > >> >> Lance Norskog
> > >> >> goksron@gmail.com
> > >> >>
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Lance Norskog
> > >> goksron@gmail.com
> > >>
> > >
> > >
> >
>
