hadoop-user mailing list archives

From Jeremy Lewi <jer...@lewi.us>
Subject Re: Counters that track the max value
Date Fri, 05 Oct 2012 17:40:50 GMT
Done.
https://issues.apache.org/jira/browse/MAPREDUCE-4709

Thanks
J

On Fri, Oct 5, 2012 at 10:13 AM, Harsh J <harsh@cloudera.com> wrote:

> Jeremy,
>
> I suppose that's doable; please file a MAPREDUCE JIRA so you can
> discuss this with others on the development side as well.
>
> I am guessing that MAX operations of most of the user-oriented data
> flow front-ends such as Hive and Pig already do this efficiently, so
> perhaps there hasn't been a very strong need for this.
>
> On Fri, Oct 5, 2012 at 9:18 PM, Jeremy Lewi <jeremy@lewi.us> wrote:
> > Hi Harsh,
> >
> > Thank you very much, that will work.
> >
> > How come we can't simply create a modification of a regular mapreduce
> > counter which does this behind the scenes? It seems like we should just be
> > able to replace "+" with "max" and everything else should work?
> >
> > J
> >
> >
> > On Wed, Oct 3, 2012 at 9:52 AM, Harsh J <harsh@cloudera.com> wrote:
> >>
> >> Jeremy,
> >>
> >> Here's my shot at it (pardon the quick crappy code):
> >> https://gist.github.com/3828246
> >>
> >> Basically - you can achieve it in two ways:
> >>
> >> Requirement:  All tasks must increment the "max" designated counter
> >> only AFTER the max has been computed (i.e. in cleanup).
> >>
> >> 1. All tasks may use the same counter name. Later, we pull per-task
> >> counters and determine the max at the client. (This is my quick and
> >> dirty implementation)
> >> 2. All tasks may use their own task ID (Number part) in the counter
> >> name, but use the same group. Later, we fetch all counters for that
> >> group and iterate over it to find the max. This is cleaner, and
> >> doesn't end up using deprecated APIs such as the above.
> >>
> >> Does this help?
> >>
> >> On Wed, Oct 3, 2012 at 8:47 PM, Jeremy Lewi <jeremy@lewi.us> wrote:
> >> > Hi hadoop-users,
> >> >
> >> > I'm curious if there is an implementation somewhere of a counter which
> >> > tracks the maximum of some value across all mappers or reducers?
> >> >
> >> > Thanks
> >> > J
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>
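
Approach 2 from the quoted message (each task publishes its local max under its own counter name in a shared group, and the client takes the max over that group) might look roughly like the sketch below. This is not the code from the gist. It is plain Java with the counter group modelled as a Map so that it runs without a cluster; the class name, the "max_" naming scheme, and the helper methods are all hypothetical. In a real job the map would instead be filled by iterating over the job's counter group after completion.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Sketch only: one counter per task in a shared group, reduced with max at the client. */
final class MaxCounterSketch {

    // Hypothetical naming scheme: a fixed prefix plus the numeric part of the task ID.
    static String counterNameFor(int taskId) {
        return "max_" + taskId;
    }

    // Task side: compute the local max over everything the task saw.
    // The task should publish this value exactly once, after all input
    // has been processed (e.g. in cleanup), because the framework adds
    // increments together rather than taking their max.
    // Returns Long.MIN_VALUE if the task saw no values.
    static long localMax(long[] values) {
        long max = Long.MIN_VALUE;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Client side: the group maps counter name -> value, one entry per task.
    // Taking the max over the values gives the job-wide max; the result is
    // empty if no task published a counter.
    static OptionalLong maxAcrossTasks(Map<String, Long> group) {
        return group.values().stream().mapToLong(Long::longValue).max();
    }
}
```

Keeping one counter per task avoids relying on per-task counter reports, at the cost of one counter per task in the group, which may matter for jobs with many tasks since Hadoop limits the number of counters per job.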
