hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: does counters go the performance down seriously?
Date Tue, 29 Mar 2011 15:43:03 GMT
On 29/03/11 16:30, Michael Segel wrote:

> "Grid Pattern: Applications should not use more than 10, 15 or 25 custom counters."
> I have to question the limitation. It seems arbitrary.
> I agree that counters add additional overhead, but suppose I wanted to run the word count
> m/r as a map only job and use counters as a way to capture a count per word?
> At what point does the cost of the counter(s) exceed the cost of the reduce job?

It's not a performance issue, it's total JT (JobTracker) memory. With too
many counters your JT goes OOM, and then it's cluster restart time: all
outstanding jobs have to restart, etc, etc.

The cost of a large cluster outage is greater than the cost of the 
reduce job.

On a small (not Yahoo!-size) cluster, if your JT process has enough
memory, you can have more counters, as there is less work to lose and
more memory to spare in the JT.
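
As a rough illustration of the memory argument above, here is a back-of-envelope sketch in Python. It is not Hadoop code and not a measurement: it assumes the JT holds one entry per counter, per task, per job it still tracks, and the function name, parameters and example figures are all hypothetical. It only shows how quickly a counter-per-word scheme outgrows a 25-counter one.

```python
def estimate_counter_entries(distinct_counters, tasks_per_job, jobs_tracked):
    """Count counter entries under the (assumed) worst case where every
    task reports every counter and the tracker keeps them all in memory."""
    return distinct_counters * tasks_per_job * jobs_tracked


if __name__ == "__main__":
    # Hypothetical workload: 1000 tasks per job, 100 jobs tracked at once.
    capped = estimate_counter_entries(25, 1000, 100)
    # Hypothetical counter-per-word job with 100,000 distinct words.
    per_word = estimate_counter_entries(100_000, 1000, 100)

    print("entries with 25 counters:    ", capped)
    print("entries with per-word counts:", per_word)
    print("growth factor:               ", per_word // capped)
```

The real per-entry cost depends on the Hadoop version and on counter name lengths, so treat the output as a ratio, not as a heap size.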
