hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop counter
Date Fri, 19 Oct 2012 16:36:48 GMT
Hi,

There are essentially 4 lists on a live JT: Running, Completed,
Failed/Killed, and Retired. The first 3 keep their counters in memory;
the last keeps them on disk. Completed and Failed/Killed jobs are moved
to Retired (persisted to disk, and garbage-collected out of memory)
after a default period of 24 hours past their finish time.
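For reference, the retirement window described above is controlled in MR1 by a JobTracker property; a sketch of the relevant mapred-site.xml entry (property name and default as I recall them from MR1-era configs, worth verifying against your version):

```xml
<!-- How long (in ms) a finished job's in-memory data, including its
     counters, stays on the JobTracker heap before being retired to
     the JobHistory files on disk. 86400000 ms = 24 hours (default). -->
<property>
  <name>mapred.jobtracker.retirejob.interval</name>
  <value>86400000</value>
</property>
```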

On Fri, Oct 19, 2012 at 10:03 PM, Lin Ma <linlma@gmail.com> wrote:
> Hi Harsh,
>
> Thanks for the brilliant reply.
>
> For your comments -- "Yes, they are ultimately stored at JT until the job is
> retired out of heap memory (in which case, they get stored into the JobHistory
> location and format)." -- does it mean that only a running job's counters
> consume JT memory, while for a completed job the counters are stored on disk
> (I take the "JobHistory location and format" to be on disk)?
>
> regards,
> Lin
>
>
> On Sat, Oct 20, 2012 at 12:19 AM, Harsh J <harsh@cloudera.com> wrote:
>>
>> Hi,
>>
>> Inline.
>>
>> On Fri, Oct 19, 2012 at 9:39 PM, Lin Ma <linlma@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> > Thanks for the great reply. Two basic questions,
>> >
>> > - Where the counters' value are stored for successful job? On JT?
>>
>> Yes, they are ultimately stored at JT until the job is retired out of
>> heap memory (in which case, they get stored into the JobHistory
>> location and format).
>>
>> > - Supposing a specific job A completed successfully and updated related
>> > counters, is it possible for another specific job B to read counters
>> > updated
>> > by previous job A? If yes, how?
>>
>> Yes, it's possible: use the RunningJob object from the previous job
>> (or capture one) and query it. The APIs you're interested in:
>>
>> Grab a query-able object (RunningJob and/or a Job):
>>
>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/JobClient.html#getJob(org.apache.hadoop.mapred.JobID)
>> or
>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Cluster.html#getJob(org.apache.hadoop.mapreduce.JobID)
>>
>> Query counters:
>>
>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/RunningJob.html#getCounters()
>> or
>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html#getCounters()
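To illustrate the flow those APIs enable (job B looking up job A's counters by job ID through a cluster client), here is a small toy model in Python. The names (ToyCluster, ToyJob, get_counter) are illustrative only, not the real Hadoop API; the real calls are the JobClient/Cluster and RunningJob/Job methods linked above.

```python
# Toy model: a "cluster" keeps finished jobs addressable by job ID,
# and a later job (or its driver) fetches an earlier job's counters.
class ToyJob:
    def __init__(self, job_id, counters):
        self.job_id = job_id
        self._counters = counters  # {(group, name): value}

    def get_counter(self, group, name):
        return self._counters.get((group, name), 0)

class ToyCluster:
    def __init__(self):
        self._jobs = {}

    def submit(self, job):
        self._jobs[job.job_id] = job

    def get_job(self, job_id):
        # Mirrors the shape of Cluster.getJob(JobID):
        # returns None for an unknown (or fully retired) job.
        return self._jobs.get(job_id)

cluster = ToyCluster()
cluster.submit(ToyJob("job_201210190001_0001",
                      {("MyCounters", "BAD_RECORDS"): 42}))

# Later, "job B" reads job A's counters by its ID:
job_a = cluster.get_job("job_201210190001_0001")
print(job_a.get_counter("MyCounters", "BAD_RECORDS"))  # 42
```

In real code the driver of job B would hold a JobID (or the job ID string) for job A, fetch the job handle, and read values via getCounters().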
>>
>> > regards,
>> > Lin
>> >
>> >
>> > On Fri, Oct 19, 2012 at 11:50 PM, Harsh J <harsh@cloudera.com> wrote:
>> >>
>> >> Bejoy is almost right, except that counters are reported as tasks
>> >> make progress (via TT heartbeats to the JT, in fact), but the final
>> >> counter values are computed only from the successful task reports
>> >> the job received, not from any failed or killed ones.
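The aggregation rule described above (only successful attempts count toward the job's final totals) can be sketched as a self-contained toy, purely to illustrate the behavior rather than Hadoop's actual implementation:

```python
# Toy illustration: task attempts report counters during the job, but
# the final job counters are summed only over attempts that SUCCEEDED;
# failed or killed attempts contribute nothing to the totals.
def final_counters(task_attempts):
    totals = {}
    for attempt in task_attempts:
        if attempt["state"] != "SUCCEEDED":
            continue  # counters from failed/killed attempts are dropped
        for name, value in attempt["counters"].items():
            totals[name] = totals.get(name, 0) + value
    return totals

attempts = [
    {"state": "SUCCEEDED", "counters": {"RECORDS": 100}},
    {"state": "FAILED",    "counters": {"RECORDS": 37}},  # ignored
    {"state": "SUCCEEDED", "counters": {"RECORDS": 200}},
]
print(final_counters(attempts))  # {'RECORDS': 300}
```

This is also why counter values seen mid-job (from in-progress heartbeat reports) can differ from the final values once failed attempts are excluded.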
>> >>
>> >> On Fri, Oct 19, 2012 at 8:51 PM, Bejoy KS <bejoy.hadoop@gmail.com>
>> >> wrote:
>> >> > Hi Jay
>> >> >
>> >> > Counters are reported to the JT at the end of a task. So if a task
>> >> > fails, the counters from that task are not sent to the JT and hence
>> >> > won't be included in the final value of counters for that job.
>> >> > Regards
>> >> > Bejoy KS
>> >> >
>> >> > Sent from handheld, please excuse typos.
>> >> > ________________________________
>> >> > From: Jay Vyas <jayunit100@gmail.com>
>> >> > Date: Fri, 19 Oct 2012 10:18:42 -0500
>> >> > To: <user@hadoop.apache.org>
>> >> > ReplyTo: user@hadoop.apache.org
>> >> > Subject: Re: Hadoop counter
>> >> >
>> >> > Ah, this answers a lot about why some of my dynamic counters never
>> >> > show up and I have to bite my nails waiting to see what's going on
>> >> > until the end of the job. Thanks.
>> >> >
>> >> > Another question: what happens if a task fails? What happens to its
>> >> > counters? Do they disappear into the ether? Or do they get merged
>> >> > in with the counters from other tasks?
>> >> >
>> >> > On Fri, Oct 19, 2012 at 9:50 AM, Bertrand Dechoux
>> >> > <dechouxb@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> And by default the number of counters is limited to 120, via the
>> >> >> mapreduce.job.counters.limit property.
>> >> >> They are useful for displaying short statistics about a job but
>> >> >> should not be used for results (imho).
>> >> >> I know people may misuse them but I haven't tried, so I wouldn't
>> >> >> be able to list the caveats.
>> >> >>
>> >> >> Regards
>> >> >>
>> >> >> Bertrand
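The cap on the number of distinct counters mentioned above can be sketched with a toy class; this only illustrates the enforcement idea (reject new counters past a configured limit), not Hadoop's actual implementation, and the 120 default is taken from Bertrand's note:

```python
# Toy sketch of a counter-count cap like mapreduce.job.counters.limit:
# incrementing an existing counter is always fine, but creating a new
# one past the limit is rejected.
class LimitedCounters:
    def __init__(self, limit=120):
        self.limit = limit
        self._counters = {}

    def increment(self, group, name, amount=1):
        key = (group, name)
        if key not in self._counters and len(self._counters) >= self.limit:
            raise RuntimeError(f"too many counters: limit is {self.limit}")
        self._counters[key] = self._counters.get(key, 0) + amount

c = LimitedCounters(limit=2)
c.increment("app", "a")
c.increment("app", "b")
c.increment("app", "a")      # existing counter: always allowed
try:
    c.increment("app", "c")  # third distinct counter: rejected
except RuntimeError as e:
    print("rejected:", e)
```

In real MR1 jobs, exceeding the limit similarly fails the job rather than silently dropping counters, which is one reason counters suit short statistics better than bulk results.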
>> >> >>
>> >> >>
>> >> >> On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel
>> >> >> <michael_segel@hotmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> As I understand it... each Task has its own counters, which are
>> >> >>> independently updated. As they report back to the JT, they update
>> >> >>> the counter(s)' status.
>> >> >>> The JT then will aggregate them.
>> >> >>>
>> >> >>> In terms of performance, counters take up some memory in the JT,
>> >> >>> so while it's OK to use them, if you abuse them, you can run into
>> >> >>> issues.
>> >> >>> As to limits... I guess that will depend on the amount of memory
>> >> >>> on the JT machine, the size of the cluster (number of TTs), and
>> >> >>> the number of counters.
>> >> >>>
>> >> >>> In terms of global accessibility... Maybe.
>> >> >>>
>> >> >>> The reason I say maybe is that I'm not sure what you mean by
>> >> >>> globally accessible.
>> >> >>> If a task creates and implements a dynamic counter... I know that
>> >> >>> it will eventually be reflected in the JT. However, I do not
>> >> >>> believe that a separate task could connect to the JT and see
>> >> >>> whether the counter exists, or get a value, or even an accurate
>> >> >>> value, since the updates are asynchronous.
>> >> >>> Not to mention that I don't believe the counters are aggregated
>> >> >>> until the job ends. It would make sense that the JT maintains a
>> >> >>> unique counter for each task until the tasks complete. (If a task
>> >> >>> fails, it would have to delete its counters so that when the task
>> >> >>> is restarted the correct count is maintained.) Note, I haven't
>> >> >>> looked at the source code, so I am probably wrong.
>> >> >>>
>> >> >>> HTH
>> >> >>> Mike
>> >> >>> On Oct 19, 2012, at 5:50 AM, Lin Ma <linlma@gmail.com> wrote:
>> >> >>>
>> >> >>> Hi guys,
>> >> >>>
>> >> >>> I have some quick questions regarding Hadoop counters.
>> >> >>>
>> >> >>> Is a Hadoop counter (custom defined) globally accessible (for
>> >> >>> both read and write) by all Mappers and Reducers in a job?
>> >> >>> What are the performance implications and best practices of
>> >> >>> using Hadoop counters? I am not sure whether using Hadoop
>> >> >>> counters too heavily will degrade the performance of the
>> >> >>> whole job.
>> >> >>>
>> >> >>> regards,
>> >> >>> Lin
>> >> >>>
>> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Bertrand Dechoux
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Jay Vyas
>> >> > http://jayunit100.blogspot.com
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J
