hadoop-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject Re: Hadoop counter
Date Fri, 19 Oct 2012 16:09:56 GMT
Hi Harsh,

Thanks for the great reply. Two basic questions:

- Where are the counters' values stored for a successful job? On the JT?
- Suppose a specific job A completed successfully and updated its
counters. Is it possible for another job B to read the counters
updated by job A? If yes, how?

regards,
Lin

On Fri, Oct 19, 2012 at 11:50 PM, Harsh J <harsh@cloudera.com> wrote:

> Bejoy is almost right, except that counters are reported as the tasks
> themselves progress (via TT heartbeats to the JT, actually). The final
> counter representation, however, is computed only from the successful
> task reports the job received, not from any failed or killed ones.
>
> On Fri, Oct 19, 2012 at 8:51 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
> > Hi Jay
> >
> > Counters are reported to the JT at the end of a task. So if a task fails,
> > the counters from that task are not sent to the JT and hence won't be
> > included in the final value of the counters for that job.
> > Regards
> > Bejoy KS
> >
> > Sent from handheld, please excuse typos.
> > ________________________________
> > From: Jay Vyas <jayunit100@gmail.com>
> > Date: Fri, 19 Oct 2012 10:18:42 -0500
> > To: <user@hadoop.apache.org>
> > ReplyTo: user@hadoop.apache.org
> > Subject: Re: Hadoop counter
> >
> > Ah, this answers a lot about why some of my dynamic counters never show
> > up and I have to bite my nails waiting until the end of the job to see
> > what's going on. Thanks.
> >
> > Another question: what happens if a task fails? What happens to the
> > counters for it? Do they disappear into the ether? Or do they get merged
> > in with the counters from other tasks?
> >
> > On Fri, Oct 19, 2012 at 9:50 AM, Bertrand Dechoux <dechouxb@gmail.com>
> > wrote:
> >>
> >> And by default the number of counters is limited to 120 with the
> >> mapreduce.job.counters.limit property.
> >> They are useful for displaying short statistics about a job but should
> >> not be used for results (imho).
> >> I know people may misuse them but I haven't tried so I wouldn't be able
> >> to list the caveats.
> >>
> >> Regards
> >>
> >> Bertrand
> >>
> >>
> >> On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel
> >> <michael_segel@hotmail.com> wrote:
> >>>
> >>> As I understand it... each task has its own counters, which are
> >>> updated independently. As the tasks report back to the JT, they update
> >>> the status of their counter(s).
> >>> The JT then aggregates them.
> >>>
> >>> In terms of performance, counters take up some memory in the JT, so
> >>> while it's OK to use them, if you abuse them you can run into issues.
> >>> As to limits... I guess that will depend on the amount of memory on the
> >>> JT machine, the size of the cluster (number of TTs) and the number of
> >>> counters.
> >>>
> >>> In terms of global accessibility... Maybe.
> >>>
> >>> The reason I say maybe is that I'm not sure what you mean by globally
> >>> accessible.
> >>> If a task creates and implements a dynamic counter, I know that it will
> >>> eventually be reflected in the JT. However, I do not believe that a
> >>> separate task could connect to the JT and see whether the counter
> >>> exists, or get a value, let alone an accurate one, since the updates
> >>> are asynchronous. Not to mention that I don't believe the counters are
> >>> aggregated until the job ends. It would make sense for the JT to
> >>> maintain a unique counter for each task until the tasks complete. (If a
> >>> task fails, the JT would have to delete that task's counters so that
> >>> the correct count is maintained when the task is restarted.) Note that
> >>> I haven't looked at the source code, so I am probably wrong.
> >>>
> >>> HTH
> >>> Mike
> >>> On Oct 19, 2012, at 5:50 AM, Lin Ma <linlma@gmail.com> wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> I have some quick questions regarding Hadoop counters:
> >>>
> >>> Are Hadoop counters (custom-defined) globally accessible (for both read
> >>> and write) to all Mappers and Reducers in a job?
> >>> What are the performance implications and best practices of using
> >>> Hadoop counters? I am not sure whether using Hadoop counters too
> >>> heavily will degrade the performance of the whole job.
> >>>
> >>> regards,
> >>> Lin
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Bertrand Dechoux
> >
> >
> >
> >
> > --
> > Jay Vyas
> > http://jayunit100.blogspot.com
>
>
>
> --
> Harsh J
>
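
For what it's worth, the rule described in the quoted reply above (the final
counter values come only from task attempts that succeeded, not from failed
or killed ones) can be sketched as a toy model in plain Java. This is not
Hadoop code and not how the JT is implemented: the TaskReport class, the
State enum, the aggregate method and the RECORDS_SEEN counter name are all
hypothetical, made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hadoop classes.
final class CounterAggregationSketch {

    // Hypothetical terminal states of a task attempt.
    enum State { SUCCEEDED, FAILED, KILLED }

    // Hypothetical per-attempt report: a terminal state plus that attempt's own counters.
    static final class TaskReport {
        final State state;
        final Map<String, Long> counters;

        TaskReport(State state, Map<String, Long> counters) {
            this.state = state;
            this.counters = counters;
        }
    }

    // Sums counters across attempts, skipping any attempt that did not succeed.
    static Map<String, Long> aggregate(List<TaskReport> reports) {
        Map<String, Long> totals = new HashMap<>();
        for (TaskReport report : reports) {
            if (report.state != State.SUCCEEDED) {
                continue; // failed and killed attempts contribute nothing
            }
            for (Map.Entry<String, Long> e : report.counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    // Convenience helper that builds a map holding a single counter.
    static Map<String, Long> counters(String name, long value) {
        Map<String, Long> m = new HashMap<>();
        m.put(name, value);
        return m;
    }

    public static void main(String[] args) {
        List<TaskReport> reports = new ArrayList<>();
        reports.add(new TaskReport(State.SUCCEEDED, counters("RECORDS_SEEN", 10L)));
        reports.add(new TaskReport(State.FAILED, counters("RECORDS_SEEN", 7L)));
        reports.add(new TaskReport(State.SUCCEEDED, counters("RECORDS_SEEN", 5L)));
        System.out.println(aggregate(reports)); // prints {RECORDS_SEEN=15}
    }
}
```

In this sketch the failed attempt's 7 is dropped and only the two successful
attempts (10 and 5) are summed, which is the behaviour described in the
thread.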
