hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Parallell maps
Date Fri, 03 Jul 2009 14:39:58 GMT
Mark Kerzner wrote:
> That's awesome information, Marcus.
> I am working on a project which would require a similar architectural
> solution (although unlike you I can't broadcast the details), so that was
> very useful. One thing I can say though is that mine is in no way a
> competitor, being in a different area.
> 
> If I could find out more - would be even better. For example, how do you do
> Page Rank. Although I think that I have seen PageRank algorithm in MR
> somewhere (Google actually playfully revealing the secret), and surely
> Pregel promises this code in 15 lines.
> 

Paolo's ranking code is now checked in to our public SVN repository:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/citerank/


* It may say LGPL on it, but we plan to do a bulk switch of
the entire code from that license to Apache shortly; I'm just keeping every
header consistent until then.

* The standalone build file is standalone.xml; it builds against hadoop 18.4. The
build.xml and ivy file build it against my branch of 0.21, which is currently
fixed to SVN_TRUNK of Jun 12-13, pre-fork:
http://svn.apache.org/viewvc/hadoop/core/branches/HADOOP-3628-2/

* The code is interesting as it's fairly CPU-intensive for not much
data; the opposite of terasort. I'm thinking of ways to do
power/performance tests on some machines to see what kind of box is
most energy efficient at doing the work.
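
For anyone wondering what "CPU-intensive for not much data" looks like, here is a rough sketch of a PageRank-style iteration split into a map step and a reduce step. This is not Paolo's code and does not use the Hadoop API; it is a generic, in-memory illustration, and every name in it (map_phase, reduce_phase, pagerank, the damping and iterations parameters) is made up for the example. It assumes every page has at least one out-link and that every link target is itself a key in the graph.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Map step: each page splits its current rank evenly across its out-links."""
    for page, links in graph.items():
        if links:
            share = ranks[page] / len(links)
            for target in links:
                yield target, share


def reduce_phase(graph, contributions, damping):
    """Reduce step: sum the shares sent to each page, then apply the damping factor."""
    sums = defaultdict(float)
    for target, share in contributions:
        sums[target] += share
    n = len(graph)
    return {page: (1 - damping) / n + damping * sums[page] for page in graph}


def pagerank(graph, damping=0.85, iterations=20):
    """Repeat map + reduce a fixed number of times (hypothetical defaults)."""
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        ranks = reduce_phase(graph, map_phase(graph, ranks), damping)
    return ranks


if __name__ == "__main__":
    # Tiny three-page graph: a links to b and c, b links to c, c links to a.
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, rank in sorted(pagerank(links).items()):
        print(page, round(rank, 4))
```

The point is that the same small link graph gets reprocessed on every iteration, so the work is dominated by repeated computation rather than by reading large inputs.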


-- 
Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/
