hadoop-common-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Parallell maps
Date Fri, 03 Jul 2009 23:16:05 GMT
That doesn't actually speed things up.  Generally, in fact, it slows things
down.

This is a case of sequential update.  Batch update converges more slowly in
terms of the total number of operations, but because of the economies
available in map-reduce programs (due to sequential reading, merge sorting,
shared nothing and so on), the convergence in terms of time is considerably
faster, especially if you hold total hardware constant.
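
To make the distinction concrete, here is a minimal single-process sketch in Python. It is not from the original message: the toy graph LINKS, the function names and the damping factor of 0.85 are all assumptions chosen for illustration. A sequential sweep reads scores already updated in the same pass (the memcached-style approach), while a batch sweep reads only the scores from the previous pass, which is what a map-reduce iteration does. The sketch only counts sweeps; it cannot show the I/O economies of map-reduce, which are the reason batch can still win on wall-clock time.

```python
def in_links(out_links):
    """Invert an adjacency list: node -> list of nodes that link to it."""
    incoming = {v: [] for v in out_links}
    for u, targets in out_links.items():
        for v in targets:
            incoming[v].append(u)
    return incoming


def pagerank(out_links, sequential, d=0.85, tol=1e-10, max_sweeps=1000):
    """Return (ranks, sweeps used). Assumes every node has at least one out-link.

    sequential=True  : each node reads the latest scores, including ones
                       updated earlier in the same sweep.
    sequential=False : each node reads only the scores from the previous
                       sweep (batch update).
    """
    n = len(out_links)
    incoming = in_links(out_links)
    rank = {v: 1.0 / n for v in out_links}
    for sweep in range(1, max_sweeps + 1):
        # Batch mode reads from a frozen snapshot of the previous sweep.
        source = rank if sequential else dict(rank)
        delta = 0.0
        for v in out_links:
            new = (1 - d) / n + d * sum(
                source[u] / len(out_links[u]) for u in incoming[v]
            )
            delta = max(delta, abs(new - rank[v]))
            rank[v] = new
        if delta < tol:
            return rank, sweep
    return rank, max_sweeps


# Hypothetical four-node graph with no dangling nodes.
LINKS = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}

if __name__ == "__main__":
    for mode in (False, True):
        ranks, sweeps = pagerank(LINKS, sequential=mode)
        label = "sequential" if mode else "batch"
        print(label, "sweeps:", sweeps, "ranks:", ranks)
```

Both variants settle on the same scores; they differ only in how many sweeps they need and in whether each update requires random access to shared state.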

This is *exactly* the same point that I have been making.  You should *not*
be doing random access during a page rank computation (not if you want high

On Fri, Jul 3, 2009 at 2:13 PM, Marcus Herou <marcus.herou@tailsweep.com> wrote:

> We use memcached during PR calculations to store the node's temporary score
> so whenever you calculate the score for another node which is dependent on
> the node in question you can access the previously calculated scores. Was
> that explicit or just confusing ? Trying once more...
