hadoop-user mailing list archives

From Ravi Prakash <ravi...@ymail.com>
Subject Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?
Date Fri, 18 Oct 2013 15:05:53 GMT
Sam, my guess is that the jar file you think is running is not the one actually being used.
There is probably an unmodified jar file (without your changes) on the task classpath that is
being picked up before your modified jar file.
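
One way to test this guess is to log where the JVM actually loaded the class from. The following is a minimal sketch using only plain JDK APIs; ClassOrigin and locate are hypothetical names that do not come from Hadoop or from this thread. In the real job you would pass the partitioner class (assuming it is visible where you call this) and look for the output in the task logs rather than on the client console.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

/** Hypothetical helper: reports where the JVM loaded a class from. */
final class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locate(Class<?> cls) {
        ProtectionDomain domain = cls.getProtectionDomain();
        CodeSource source = (domain == null) ? null : domain.getCodeSource();
        // JDK classes loaded by the bootstrap loader have no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Stand-in classes; in the real job, pass the partitioner class instead.
        System.out.println("ClassOrigin loaded from: " + locate(ClassOrigin.class));
        System.out.println("String loaded from: " + locate(String.class));
    }
}
```

If the reported location is not the jar you rebuilt, an older copy earlier on the task classpath is most likely the one being used.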





On Thursday, October 17, 2013 10:13 PM, sam liu <samliuhadoop@gmail.com> wrote:
 
This is really weird and confusing to me. Can anyone help with this question?

Thanks!




2013/10/16 sam liu <samliuhadoop@gmail.com>

>Hi Experts,
>
>In Hadoop-2.0.4, TeraSort uses TeraSort#TotalOrderPartitioner as its partitioner:
>'job.setPartitionerClass(TotalOrderPartitioner.class);'. However, it seems YARN does not execute
>the methods of TeraSort#TotalOrderPartitioner at all. I ran the following tests to verify this:
>
>Test 1: Add code to the readPartitions() and setConf() methods of TeraSort#TotalOrderPartitioner
>that prints some words and writes some words to a file.
>Expected Result: Some words are printed and written to a file
>Actual Result: Nothing was printed or written to a file
>
>Test 2: Remove all existing methods from TeraSort#TotalOrderPartitioner, leaving only the
>necessary methods with empty bodies
>Expected Result: The TeraSort job fails with an exception, since the specified partitioner is not
>implemented at all
>Actual Result: The TeraSort job completed successfully without any exception
>
>These tests confused me a lot, because it seems YARN never uses the specified partitioner,
>TeraSort#TotalOrderPartitioner, during job execution.
>
>Can anyone help explain why?
>
>Thanks very much!
>