hadoop-common-user mailing list archives

From Owen O'Malley <omal...@apache.org>
Subject Re: Question about distributed sort
Date Sun, 24 Aug 2008 23:21:36 GMT

On Aug 22, 2008, at 4:00 PM, Alex Holmes wrote:

> For a given input key, K, in a reduce task, does Hadoop guarantee that
> all mapper-emitted values for key K are available in the iterator?  Is
> it possible that multiple reduce tasks can receive the same key?

All key/value pairs emitted by any of the mappers for a given key K
are sent to the same reduce task, so a key is never split across
reduce tasks. The reduce is chosen by the partitioner, which the
application can specify. The default partitioner computes
(key.hashCode() & Integer.MAX_VALUE) % numReduces.
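
As a minimal sketch of that formula in plain Java (this is not Hadoop's
own partitioner class; the names HashPartitionSketch and partitionFor
are made up for illustration), the mask clears the sign bit so that a
key with a negative hashCode() still maps to a valid reduce index:

```java
// Hypothetical helper that mirrors the formula quoted above.
// It does not depend on Hadoop and is only meant to show the arithmetic.
final class HashPartitionSketch {

    private HashPartitionSketch() {
        // static helper only
    }

    // Returns the index of the reduce (0 .. numReduces - 1) for a key.
    // Assumes numReduces > 0.
    static int partitionFor(Object key, int numReduces) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so the
        // left operand of % is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduces;
    }
}
```

Because the result depends only on key.hashCode() and numReduces, every
mapper computes the same index for the same key, provided hashCode() is
stable across JVMs for that key type.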

-- Owen
