hadoop-common-user mailing list archives

From Mikhail Yakshin <greycat.na....@gmail.com>
Subject Re: Reduce doesn't start until map finishes
Date Wed, 04 Mar 2009 00:00:45 GMT
On Wed, Mar 4, 2009 at 2:09 AM, Chris Douglas wrote:
> This is normal behavior. The Reducer is guaranteed to receive all the
> results for its partition in sorted order. No reduce can start until all the
> maps are completed, since any running map could emit a result that would
> violate the order for the results it currently has. -C

_Reducers_ usually start almost immediately and begin downloading the
data emitted by mappers as each mapper finishes. This is their first
phase. Their second phase can start only after all mappers have
completed. In the second phase they sort the received data, and in the
third phase they do the real reduction.
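
To make the three phases concrete, here is a toy, single-process sketch in Python. It is not Hadoop code: run_map and reduce_side are hypothetical names, the "mappers" just run one after another, and word counting stands in for the real job. It only illustrates why copying can overlap with the maps while sorting and reducing cannot.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def run_map(split):
    """Hypothetical mapper: emit (word, 1) pairs, sorted by key."""
    return sorted((word, 1) for word in split.split())


def reduce_side(splits):
    """Toy reducer covering a single partition (an assumption of this sketch)."""
    # Phase 1 (copy): fetch each map output as soon as that mapper is done.
    # Each loop iteration stands in for one mapper finishing.
    copied = []
    for split in splits:
        copied.append(run_map(split))

    # Phase 2 (sort): merge the sorted map outputs. This cannot begin earlier,
    # because a still-running mapper could emit a key that belongs before
    # the records already merged.
    merged = heapq.merge(*copied)

    # Phase 3 (reduce): consume the merged stream one key group at a time.
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(merged, key=itemgetter(0))
    }


counts = reduce_side(["b a", "a c", "c a"])
print(counts)
```

Note that the last split still contributes an "a", which is exactly the kind of late record that keeps phase 2 from starting before every map has finished.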

WBR, Mikhail Yakshin
