Hi,
two solutions have been suggested so far. They take advantage of either
(a) a vector x matrix product (your CF / Mahout example) or (b) a small
matrix x large matrix product (an earlier suggestion of putting the small
matrix into the Distributed Cache). It is not yet clear what a good
approach is for (c) large matrix x large matrix.
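
To make (b) concrete, here is a minimal sketch in plain Python that only
simulates the map and reduce phases in one process; it is not Hadoop or
Mahout code. It assumes the large matrix B is stored one row per record as
(k, row) and that every mapper holds a full in-memory copy of the small
matrix A, as it would after loading it from the Distributed Cache. The
names map_row, reduce_row and multiply are made up for illustration.

```python
from collections import defaultdict


def map_row(small_a, k, b_row):
    """Mapper: gets row k of the large matrix B.

    small_a is the whole small matrix A, assumed to be held in memory by
    every mapper (e.g. loaded from the Distributed Cache).
    Emits (i, A[i][k] * B[k, :]) for every row i of A.
    """
    for i, a_row in enumerate(small_a):
        a_ik = a_row[k]
        yield i, [a_ik * b_kj for b_kj in b_row]


def reduce_row(partials):
    """Reducer: sums the partial rows that share one output row index."""
    total = [0] * len(partials[0])
    for partial in partials:
        for j, value in enumerate(partial):
            total[j] += value
    return total


def multiply(small_a, large_b_rows):
    """Simulates map, shuffle and reduce for C = A x B in one process."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for k, b_row in large_b_rows:
        for i, partial in map_row(small_a, k, b_row):
            grouped[i].append(partial)
    return [reduce_row(grouped[i]) for i in sorted(grouped)]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]                  # small matrix, 2 x 2
    b = [(0, [5, 6, 7]), (1, [8, 9, 10])]  # large matrix as (k, row) records
    print(multiply(a, b))
```

Each mapper needs only its own rows of B plus the cached copy of A, so the
shuffle carries nothing but partial output rows. That stops working in
case (c), where neither matrix fits in memory on a single node.
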
2011/11/19 <bejoy.hadoop@gmail.com>
> Hey Mike
> In Mahout, one place where matrix multiplication is used is the
> distributed implementation of Collaborative Filtering. The recommendations
> there are generated by multiplying a co-occurrence matrix with a user
> vector. The user vector is treated as a single-column matrix, and the
> matrix multiplication is then carried out.
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: Mike Spreitzer <mspreitz@us.ibm.com>
> Date: Fri, 18 Nov 2011 14:52:05
> To: <common-user@hadoop.apache.org>
> Reply-To: common-user@hadoop.apache.org
> Subject: RE: Matrix multiplication in Hadoop
>
> Well, this mismatch may tell me something interesting about Hadoop. Matrix
> multiplication has a lot of inherent parallelism, so from very crude
> considerations it is not obvious that there should be a mismatch. Why is
> matrix multiplication ill-suited for Hadoop?
>
> BTW, I looked into the Mahout documentation some, and did not find matrix
> multiplication there. It might be hidden inside one of the advertised
> algorithms; I looked at the documentation for a few, but did not notice
> mention of MM.
>
> Thanks,
> Mike
>
>
>
> From: Michael Segel <michael_segel@hotmail.com>
> To: <common-user@hadoop.apache.org>
> Date: 11/18/2011 01:49 PM
> Subject: RE: Matrix multiplication in Hadoop
>
>
>
>
> Ok Mike,
>
> First I admire that you are studying Hadoop.
>
> To answer your question... not well.
>
> Might I suggest that if you want to learn Hadoop, you try to find a
> problem which can easily be broken into a series of parallel tasks where
> there are minimal communication requirements between the tasks?
>
> No offense, but if I could make a parallel... what you're asking is akin
> to taking a normalized relational model and trying to run it as is in
> HBase.
> Yes it can be done. But not the best use of resources.
>
> > To: common-user@hadoop.apache.org
> > CC: common-user@hadoop.apache.org
> > Subject: Re: Matrix multiplication in Hadoop
> > From: mspreitz@us.ibm.com
> > Date: Fri, 18 Nov 2011 12:39:00 -0500
> >
> > That's also an interesting question, but right now I am studying Hadoop
> > and want to know how well dense MM can be done in Hadoop.
> >
> > Thanks,
> > Mike
> >
> >
> >
> > From: Michel Segel <michael_segel@hotmail.com>
> > To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
> > Date: 11/18/2011 12:34 PM
> > Subject: Re: Matrix multiplication in Hadoop
> >
> >
> >
> > Is Hadoop the best tool for doing large matrix math?
> > Sure, you can do it, but aren't there better tools for these types of
> > problems?
> >
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
>
>
>
