hadoop-common-user mailing list archives

From Jürgen Broß <juergen.br...@fu-berlin.de>
Subject How to let Reducer know on which partition it is working
Date Wed, 26 Nov 2008 12:35:51 GMT
Hi all,

My Reducers need to load a huge HashMap from data stored in HDFS.
This data has been partitioned by a previous map/reduce job. The
complete data set would not fit into the main memory of a Reducer machine. It
would suffice to load only the correct partition of the data. The
problem is that the "correct" partition is determined by the
Partitioner that feeds the current Reducers. I'm not sure how to let a
Reducer know, in its configure() method, which partition it will get from
the Partitioner, i.e. which partition to load from HDFS into the HashMap.

Maybe someone has a good idea.
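
One possible direction, as a rough sketch only: in the old org.apache.hadoop.mapred API, each task's job configuration carries the task's partition index under the property mapred.task.partition, so configure(JobConf job) could read it with job.getInt("mapred.task.partition", -1). The sketch below uses a plain Map as a stand-in for JobConf so that it runs without Hadoop on the classpath. The class name PartitionLookup and the directory /data/lookup are hypothetical. It also assumes that the previous job used the same Partitioner and the same number of reduce tasks, and that it wrote the default part-NNNNN output files.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for Hadoop's JobConf here.
public class PartitionLookup {
    // Property set per task by the old org.apache.hadoop.mapred API.
    static final String PARTITION_KEY = "mapred.task.partition";

    // Read the partition index this task is responsible for.
    static int partitionOf(Map<String, String> conf) {
        String value = conf.get(PARTITION_KEY);
        if (value == null) {
            throw new IllegalStateException(PARTITION_KEY + " is not set");
        }
        return Integer.parseInt(value);
    }

    // Default reduce output naming of the old API: part-00000, part-00001, ...
    static String partFileName(int partition) {
        return String.format(Locale.ROOT, "part-%05d", partition);
    }

    // Path of the side-data file matching this task's partition.
    // Assumption: the earlier job used the same Partitioner and reducer count.
    static String sideDataPath(String dir, Map<String, String> conf) {
        return dir + "/" + partFileName(partitionOf(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PARTITION_KEY, "7"); // in a real task, Hadoop sets this value
        System.out.println(sideDataPath("/data/lookup", conf));
    }
}
```

Inside a real Reducer, the resulting path would be opened through the HDFS FileSystem API in configure() and only that one file loaded into the HashMap.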

