hadoop-common-user mailing list archives

From abc xyz <fabc_xyz...@yahoo.com>
Subject Total order partitioner
Date Mon, 09 Aug 2010 15:05:39 GMT
The input splits are sampled when we use the total order partitioner. I would
like to know how and when this sampling is done. Is it done before the master
allocates tasks to the nodes, given that the sample file also has to be added
to the distributed cache? If so, is the sampling carried out on the master
node, and does the master then have to read the input splits itself to get
the samples?
