Message-ID: <25013023.post@talk.nabble.com>
Date: Mon, 17 Aug 2009 12:47:41 -0700 (PDT)
From: tigertail
To: core-user@hadoop.apache.org
Subject: Re: Percentage calculation?
In-Reply-To: <25008761.post@talk.nabble.com>
References: <25008761.post@talk.nabble.com>

Can somebody help, please? I would expect there to be some easy way to do this.

A correction: in the reducer I call
context.getCounter(Counters.INPUT_WORDS).getValue();
but it does not work. It always returns 0.

tigertail wrote:
>
> Hi Hadoop/MapReduce experts,
>
> My question might be naive, but I am really stuck here and I am looking
> forward to getting help and advice from you.
>
> I have an input file like
> key1, 2
> key2, 1
> key1, 1
> key3, 1
>
> It is easy to write M/R code to calculate the count for each key and
> output something like
> key1, 3
> key2, 1
> key3, 1
>
> But how can I calculate the percentage of each key over all keys? With
> the above input, I would expect to get the output
> key1, 0.60
> key2, 0.20
> key3, 0.20
>
> One naive method is to calculate the total count (5 with the above input)
> and save it in a file. The file is then read in before the M/R job starts.
> But it is obviously ugly and slow.
>
> I also tried to set a static enum Counters { INPUT_WORDS }
> In the mapper I do context.getCounter(Counters.INPUT_WORDS).increment(1);
> In the reducer I do context.getCounter(Counters.INPUT_WORDS).getValue();
> But it does not work. It always returns 0.
>
> Is there a more elegant way?
>
--
View this message in context: http://www.nabble.com/Percentage-calculation--tp25008761p25013023.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
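
For illustration, the arithmetic behind the requested output can be sketched in plain Java, with no Hadoop dependency. The class and method names (PercentageSketch, countPerKey, percentages) are hypothetical and do not come from this thread. This is not a MapReduce implementation; it only shows the two passes described in the question: first the per-key counts and the grand total, then the division of each count by that total.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical plain-Java sketch of the two-pass percentage computation.
final class PercentageSketch {

    // Pass 1: sum the values per key, as the existing count job does.
    // Each record is a {key, value} pair of strings.
    static Map<String, Long> countPerKey(String[][] records) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String[] record : records) {
            long value = Long.parseLong(record[1].trim());
            counts.merge(record[0].trim(), value, Long::sum);
        }
        return counts;
    }

    // Pass 2: divide each per-key count by the grand total.
    // The total must be complete before any percentage can be emitted.
    static Map<String, Double> percentages(Map<String, Long> counts) {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        if (total == 0) {
            return result; // empty input: nothing to divide
        }
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            result.put(e.getKey(), (double) e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input taken from the question.
        String[][] input = {
            {"key1", "2"}, {"key2", "1"}, {"key1", "1"}, {"key3", "1"}
        };
        percentages(countPerKey(input)).forEach(
            (k, v) -> System.out.printf(Locale.ROOT, "%s, %.2f%n", k, v));
    }
}
```

With the sample input from the question, this prints key1, 0.60, then key2, 0.20, then key3, 0.20, which matches the expected output stated above.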