hadoop-common-user mailing list archives

From tigertail <tyc...@yahoo.com>
Subject Percentage calculation?
Date Mon, 17 Aug 2009 15:22:19 GMT

Hi Hadoop/MapReduce experts,

My question might be naive, but I am really stuck here and would
appreciate any help or advice from you.

I have an input file like
key1, 2
key2, 1
key1, 1
key3, 1

It is easy to write M/R code to calculate the count for each key and
output something like
key1, 3
key2, 1
key3, 1

But how can I calculate the percentage of each key over all keys? With the
above input, I would expect to get the output as
key1, 0.60
key2, 0.20
key3, 0.20
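
To make the expected arithmetic concrete, here is a minimal sketch in plain
Java. It does not use Hadoop at all; the class name PercentageSketch and the
methods countPerKey and percentages are made up for illustration. It only
shows the two steps involved: sum the values per key, then divide each sum by
the grand total.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustration only: hypothetical class, no Hadoop APIs involved.
public class PercentageSketch {

    // Step 1: sum the values per key (what the counting job produces).
    static Map<String, Long> countPerKey(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            String key = parts[0].trim();
            long value = Long.parseLong(parts[1].trim());
            counts.merge(key, value, Long::sum);
        }
        return counts;
    }

    // Step 2: divide each per-key count by the total over all keys.
    static Map<String, Double> percentages(Map<String, Long> counts) {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            result.put(e.getKey(), (double) e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("key1, 2", "key2, 1", "key1, 1", "key3, 1");
        Map<String, Double> pct = percentages(countPerKey(input));
        for (Map.Entry<String, Double> e : pct.entrySet()) {
            // Print each key with its fraction, formatted to two decimals.
            System.out.println(String.format(Locale.ROOT, "%s, %.2f", e.getKey(), e.getValue()));
        }
    }
}
```

The difficulty in M/R is that step 2 needs the grand total, which is only
known after every input record has been seen.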

One naive method is to first calculate the total count (5 with the above
input) and save it in a file. That file is then read in before the M/R job
starts. But this is obviously ugly and slow.

I also tried defining a counter: static enum Counters { INPUT_WORDS }
In the mapper I call context.getCounter(Counters.INPUT_WORDS).increment(1);
In the reducer I call context.getCounter(Counters.INPUT_WORDS).getCounter();
But it does not work.

Is there a more elegant way?

