hadoop-common-user mailing list archives

From Jay Vyas <jayunit...@gmail.com>
Subject Re: Hadoop counter
Date Fri, 19 Oct 2012 15:18:42 GMT
Ah, this answers a lot about why some of my dynamic counters never show up
and I have to bite my nails waiting until the end of the job to see what's
going on. Thanks.

Another question: what happens if a task fails? What happens to its
counters? Do they disappear into the ether, or do they get merged in with
the counters from other tasks?
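
To make the question concrete, here is a toy model of the behavior Mike guesses at in the quoted mail: each task attempt keeps its own counters, and a failed attempt's counters are dropped before the JT sums things up. This is plain Python, not the Hadoop API. The class, method names, attempt ids, and the BAD_RECORDS counter are all made up for illustration, and the "discard on failure" rule is an assumption, not something I have checked in the source.

```python
from collections import Counter


class ToyJobTracker:
    """Hypothetical stand-in for the JT: one Counter per task attempt."""

    def __init__(self):
        self.attempt_counters = {}  # attempt id -> Counter of that attempt
        self.succeeded = set()      # attempt ids that finished cleanly

    def report(self, attempt_id, name, amount=1):
        # A task attempt only ever updates its own counters.
        self.attempt_counters.setdefault(attempt_id, Counter())[name] += amount

    def fail(self, attempt_id):
        # Assumption: counters of a failed attempt are thrown away,
        # so a retry starts counting from zero.
        self.attempt_counters.pop(attempt_id, None)

    def succeed(self, attempt_id):
        self.succeeded.add(attempt_id)

    def job_totals(self):
        # Only successful attempts contribute to the job-level totals.
        totals = Counter()
        for attempt_id in self.succeeded:
            totals.update(self.attempt_counters.get(attempt_id, Counter()))
        return totals


jt = ToyJobTracker()
jt.report("map_0_attempt_0", "BAD_RECORDS", 3)
jt.fail("map_0_attempt_0")                      # first attempt dies
jt.report("map_0_attempt_1", "BAD_RECORDS", 2)  # retry counts again
jt.succeed("map_0_attempt_1")
jt.report("map_1_attempt_0", "BAD_RECORDS", 5)
jt.succeed("map_1_attempt_0")
print(dict(jt.job_totals()))  # {'BAD_RECORDS': 7}
```

If failed attempts were merged in instead, the total here would be 10 rather than 7, which is exactly the double counting I am worried about.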

On Fri, Oct 19, 2012 at 9:50 AM, Bertrand Dechoux <dechouxb@gmail.com> wrote:

> And by default the number of counters is limited to 120 with the
> mapreduce.job.counters.limit property.
> They are useful for displaying short statistics about a job but should not
> be used for results (imho).
> I know people may misuse them but I haven't tried so I wouldn't be able to
> list the caveats.
> Regards
> Bertrand
> On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>> As I understand it... each Task has its own counters, which are
>> independently updated. As the tasks report back to the JT, they update
>> the counters' status.
>> The JT will then aggregate them.
>> In terms of performance, Counters take up some memory in the JT, so while
>> it's OK to use them, if you abuse them, you can run into issues.
>> As to limits... I guess that will depend on the amount of memory on the
>> JT machine, the size of the cluster (Number of TT) and the number of
>> counters.
>> In terms of global accessibility... Maybe.
>> The reason I say maybe is that I'm not sure what you mean by globally
>> accessible.
>> If a task creates and implements a dynamic counter... I know that it will
>> eventually be reflected in the JT. However, I do not believe that a
>> separate Task could connect with the JT and see if the counter exists or if
>> it could get a value or even an accurate value since the updates are
>> asynchronous.  Not to mention that I don't believe that the counters are
>> aggregated until the job ends. It would make sense that the JT maintains a
>> unique counter for each task until the tasks complete. (If a task fails, it
>> would have to delete the counters so that when the task is restarted the
>> correct count is maintained. )  Note, I haven't looked at the source code
>> so I am probably wrong.
>> HTH
>> Mike
>> On Oct 19, 2012, at 5:50 AM, Lin Ma <linlma@gmail.com> wrote:
>> Hi guys,
>> I have some quick questions regarding Hadoop counters:
>>    - Is a Hadoop counter (user-defined) globally accessible (for both
>>    read and write) to all Mappers and Reducers in a job?
>>    - What are the performance implications and best practices of using
>>    Hadoop counters? If I use counters too heavily, will the performance
>>    of the whole job degrade?
>> regards,
>> Lin
> --
> Bertrand Dechoux

Jay Vyas
