From: Jeremy Lewi
To: user@hadoop.apache.org
Date: Fri, 5 Oct 2012 08:48:58 -0700
Subject: Re: Counters that track the max value

Hi Harsh,

Thank you very much; that will work.

How come we can't simply create a modification of a regular mapreduce counter which does this behind the scenes? It seems like we should just be able to replace "+" with "max" and everything else should work?

J

On Wed, Oct 3, 2012 at 9:52 AM, Harsh J <harsh@cloudera.com> wrote:
Jeremy,

Here's my shot at it (pardon the quick crappy code):
https://gist.github.com/3828246

Basically - you can achieve it in two ways:

Requirement: All tasks must increment the designated "max" counter
only AFTER the max has been computed (i.e. in cleanup).
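
The requirement can be illustrated with a minimal, Hadoop-free sketch. `LocalMaxTask`, its `map`/`cleanup` methods, the `MAX_VALUE` counter name, and the `Map<String, Long>` standing in for a task's counters are all hypothetical names chosen for illustration, not Hadoop APIs. The sketch also assumes the tracked values are non-negative. In a real job the same pattern would presumably sit in the task's per-record method and its cleanup hook.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one map or reduce task; not a Hadoop class. */
final class LocalMaxTask {
    // Largest value this task has seen so far; kept in memory only.
    // Starting at 0 assumes the tracked values are non-negative.
    private long localMax = 0L;

    /** Called once per record: update the local max, touch no counter. */
    void map(long value) {
        localMax = Math.max(localMax, value);
    }

    /**
     * Called once at the end of the task: increment the (initially absent,
     * i.e. zero) counter by the local max, so its final value is that max.
     */
    void cleanup(Map<String, Long> counters, String counterName) {
        counters.merge(counterName, localMax, Long::sum);
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        LocalMaxTask task = new LocalMaxTask();
        for (long v : new long[] {3L, 17L, 5L}) {
            task.map(v);
        }
        task.cleanup(counters, "MAX_VALUE");
        System.out.println(counters); // prints {MAX_VALUE=17}
    }
}
```

Incrementing the counter inside `map` instead would add every intermediate value to it, which is why the increment is deferred until the task's max is final.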

1. All tasks may use the same counter name. Later, we pull the per-task
counters and determine the max at the client. (This is my quick and
dirty implementation.)
2. All tasks may use their own task ID (the numeric part) in the counter
name, but use the same group. Later, we fetch all counters for that
group and iterate over them to find the max. This is cleaner, and
doesn't rely on deprecated APIs the way option 1 does.
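
The client-side step of option 2 might look roughly like the sketch below. It is not the code from the gist, and it deliberately avoids Hadoop classes: a plain `Map<String, Long>` stands in for the counter group fetched from the finished job, and `GroupMaxSketch`, `maxOfGroup`, `sumOfGroup`, and the `max_task_NNNNNN` counter names are hypothetical. The `sumOfGroup` helper is included only to show what a single shared counter would hold once the per-task values are added together at the job level.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical client-side helper; the map stands in for one counter group. */
final class GroupMaxSketch {
    /** Returns the largest counter value in the group, or empty if it has none. */
    static OptionalLong maxOfGroup(Map<String, Long> group) {
        return group.values().stream().mapToLong(Long::longValue).max();
    }

    /** Sum of all values: what one shared counter holds after aggregation. */
    static long sumOfGroup(Map<String, Long> group) {
        return group.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // One counter per task, named after the numeric part of the task ID.
        Map<String, Long> group = new LinkedHashMap<>();
        group.put("max_task_000000", 17L);
        group.put("max_task_000001", 42L);
        group.put("max_task_000002", 8L);

        System.out.println(maxOfGroup(group).getAsLong()); // prints 42
        System.out.println(sumOfGroup(group));             // prints 67
    }
}
```

Keeping one counter per task is what lets the client take a max at all; a single shared counter read at the job level only exposes the sum.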

Does this help?

On Wed, Oct 3, 2012 at 8:47 PM, Jeremy Lewi <jeremy@lewi.us> wrote:
> Hi hadoop-users,
>
> I'm curious if there is an implementation somewhere of a counter which
> tracks the maximum of some value across all mappers or reducers?
>
> Thanks
> J



--
Harsh J
