hadoop-common-user mailing list archives

From Stephen Boesch <java...@gmail.com>
Subject Re: Matrix multiplication in Hadoop
Date Sat, 19 Nov 2011 23:07:48 GMT
Hi,
   Two solutions have been suggested so far, and each relies on a special
case: (a) a vector x matrix product (your CF / Mahout example), or (b) a small
matrix x large matrix product (the earlier suggestion of putting the small
matrix into the Distributed Cache). I am not yet clear on good approaches for
(c), large matrix x large matrix.
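
To make case (b) concrete, here is a minimal sketch in plain Python rather than Hadoop code. It assumes the small matrix fits in memory on every mapper (which is what the Distributed Cache would provide) and that the large matrix is stored as one column per record. The names map_column and multiply_small_by_large are made up for illustration. Under those assumptions each record is handled independently, so no reduce step is needed.

```python
from typing import Dict, Iterable, List, Tuple

Matrix = List[List[float]]
ColumnRecord = Tuple[int, List[float]]  # (column index, column values)


def map_column(small: Matrix, record: ColumnRecord) -> ColumnRecord:
    """Map step: multiply the in-memory small matrix by one column
    of the large matrix, giving the same column of the product."""
    j, column = record
    product = [sum(s * c for s, c in zip(row, column)) for row in small]
    return j, product


def multiply_small_by_large(small: Matrix,
                            large_columns: Iterable[ColumnRecord]
                            ) -> Dict[int, List[float]]:
    """Simulates a map-only job: every record is processed on its own,
    with no communication between map calls."""
    return dict(map_column(small, record) for record in large_columns)


# Hypothetical data: a 2x2 small matrix and a 2x3 large matrix
# [[1, 0, 2], [0, 1, 3]] stored column by column.
small = [[1.0, 2.0],
         [3.0, 4.0]]
large_columns = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [2.0, 3.0])]

result = multiply_small_by_large(small, large_columns)
```

Here result maps each column index to the matching column of the product. If the large matrix were stored by rows instead, each record would contribute a partial sum to every output entry, and a reduce step would be needed to add those up.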


2011/11/19 <bejoy.hadoop@gmail.com>

> Hey Mike
>          In Mahout, one place where matrix multiplication is used is the
> distributed implementation of Collaborative Filtering. The recommendations
> there are generated by multiplying a co-occurrence matrix by a user
> vector. The user vector is treated as a single-column matrix, so the
> product is an ordinary matrix multiplication.
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: Mike Spreitzer <mspreitz@us.ibm.com>
> Date: Fri, 18 Nov 2011 14:52:05
> To: <common-user@hadoop.apache.org>
> Reply-To: common-user@hadoop.apache.org
> Subject: RE: Matrix multiplication in Hadoop
>
> Well, this mismatch may tell me something interesting about Hadoop. Matrix
> multiplication has a lot of inherent parallelism, so from very crude
> considerations it is not obvious that there should be a mismatch.  Why is
> matrix multiplication ill-suited for Hadoop?
>
> BTW, I looked into the Mahout documentation some, and did not find matrix
> multiplication there.  It might be hidden inside one of the advertised
> algorithms; I looked at the documentation for a few, but did not notice
> mention of MM.
>
> Thanks,
> Mike
>
>
>
> From:   Michael Segel <michael_segel@hotmail.com>
> To:     <common-user@hadoop.apache.org>
> Date:   11/18/2011 01:49 PM
> Subject:        RE: Matrix multiplication in Hadoop
>
>
>
>
> Ok Mike,
>
> First, I admire that you are studying Hadoop.
>
> To answer your question... not well.
>
> Might I suggest that if you want to learn Hadoop, you try to find a
> problem which can easily be broken into a series of parallel tasks where
> there are minimal communication requirements between the tasks?
>
> No offense, but if I could draw a parallel... what you're asking is akin
> to taking a normalized relational model and trying to run it as is in
> HBase.
> Yes, it can be done, but it is not the best use of resources.
>
> > To: common-user@hadoop.apache.org
> > CC: common-user@hadoop.apache.org
> > Subject: Re: Matrix multiplication in Hadoop
> > From: mspreitz@us.ibm.com
> > Date: Fri, 18 Nov 2011 12:39:00 -0500
> >
> > That's also an interesting question, but right now I am studying Hadoop
> > and want to know how well dense MM can be done in Hadoop.
> >
> > Thanks,
> > Mike
> >
> >
> >
> > From:   Michel Segel <michael_segel@hotmail.com>
> > To:     "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
> > Date:   11/18/2011 12:34 PM
> > Subject:        Re: Matrix multiplication in Hadoop
> >
> >
> >
> > Is Hadoop the best tool for doing large matrix math?
> > Sure, you can do it, but aren't there better tools for these types of
> > problems?
> >
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
>
>
>
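
The co-occurrence matrix x user vector product that Bejoy describes in the quoted thread can be sketched as a map step followed by a reduce step. This is a plain-Python simulation under stated assumptions, not Mahout's actual code: the matrix is assumed to be stored as one sparse column per item, the user vector is assumed small enough to hand to every map call, and the names map_column and reduce_scores and the sample data are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def map_column(item: str,
               column: Dict[str, float],
               user_vector: Dict[str, float]) -> Iterator[Tuple[str, float]]:
    """Map step: scale one co-occurrence column by the user's
    preference for that item and emit the partial scores."""
    preference = user_vector.get(item, 0.0)
    if preference == 0.0:
        return  # items the user has not rated contribute nothing
    for other_item, count in column.items():
        yield other_item, count * preference


def reduce_scores(pairs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Reduce step: sum the partial scores emitted for each item."""
    scores: Dict[str, float] = defaultdict(float)
    for item, partial in pairs:
        scores[item] += partial
    return dict(scores)


# Hypothetical symmetric co-occurrence matrix, one sparse column per item.
cooccurrence = {
    "A": {"A": 2, "B": 1},
    "B": {"A": 1, "B": 3, "C": 2},
    "C": {"B": 2, "C": 1},
}
user_vector = {"A": 1.0, "C": 2.0}  # hypothetical preferences

pairs = [pair
         for item, column in cooccurrence.items()
         for pair in map_column(item, column, user_vector)]
scores = reduce_scores(pairs)
```

Each entry of scores equals the matching row of the co-occurrence matrix multiplied by the user vector. Summing scaled columns instead of taking row dot products means each map call needs only one column of the matrix at a time.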
