hadoop-mapreduce-dev mailing list archives

From jian yi <eyj...@gmail.com>
Subject MBR model diagram (Map-Balance-Reduce)
Date Sat, 06 Feb 2010 14:24:45 GMT
In the MR (Map-Reduce) model, the reduce tasks are not balanced, because the
partitions differ in size. How can we balance them? We can control the
partition size: rehash the larger partitions and combine the pieces up to a
specified size. If a single key has many values, it is necessary to run
MapReduce twice. The following is the model diagram:

[image: Map-Balance-Reduce.JPG]
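
As a rough illustration of the balance step described above, here is a minimal
Python sketch. It is not Hadoop code and not necessarily what the diagram
shows. The names split_partition, combine_pieces, balance and target_size are
hypothetical, partitions are modelled as in-memory lists of (key, value)
pairs, and the combine step uses a simple first-fit-decreasing packing, which
is an assumption of this sketch.

```python
import hashlib
from collections import defaultdict

MAX_DEPTH = 32  # safety guard for the recursive rehash (assumption of this sketch)


def stable_hash(key, salt):
    """Deterministic hash of a key; the salt changes the hash on each rehash round."""
    digest = hashlib.md5(f"{salt}:{key}".encode("utf-8")).hexdigest()
    return int(digest, 16)


def split_partition(records, target_size, salt=1):
    """Rehash an oversized partition by key into pieces of roughly target_size.

    All records of one key stay in the same piece, so a piece that holds a
    single hot key cannot be split here and is returned oversized.
    """
    if len(records) <= target_size:
        return [records]
    n_buckets = -(-len(records) // target_size)  # ceiling division
    buckets = defaultdict(list)
    for key, value in records:
        buckets[stable_hash(key, salt) % n_buckets].append((key, value))
    pieces = []
    for bucket in buckets.values():
        distinct_keys = {k for k, _ in bucket}
        if len(bucket) > target_size and len(distinct_keys) > 1 and salt < MAX_DEPTH:
            # Still too big but splittable: rehash again with a different salt.
            pieces.extend(split_partition(bucket, target_size, salt + 1))
        else:
            pieces.append(bucket)
    return pieces


def combine_pieces(pieces, target_size):
    """Pack pieces into reduce inputs of at most target_size (first-fit decreasing)."""
    bins = []
    for piece in sorted(pieces, key=len, reverse=True):
        for b in bins:
            if len(b) + len(piece) <= target_size:
                b.extend(piece)
                break
        else:
            # No existing bin has room (or the piece is oversized): open a new bin.
            bins.append(list(piece))
    return bins


def balance(partitions, target_size):
    """Split the larger partitions, then combine the pieces up to target_size."""
    pieces = []
    for records in partitions:
        pieces.extend(split_partition(records, target_size))
    return combine_pieces(pieces, target_size)
```

Calling balance(partitions, target_size) returns a list of reduce inputs in
which every key still appears in exactly one input. An input that exceeds
target_size in this sketch holds a single key with many values, which is the
case where the text above says a second MapReduce pass is needed.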

The scheduler can treat a task as a time slice, much as an OS scheduler does.
