hadoop-mapreduce-user mailing list archives

From George Stathis <gstat...@gmail.com>
Subject Re: Counter availability from Mapper to Reducer
Date Wed, 21 Apr 2010 13:17:38 GMT
It turns out of course that this was a fairly basic question that I was able
to answer for myself with a little more "RTFM". My understanding of how
Counters really work in the MapReduce workflow was wrong.

For other beginners like me, if you ever wonder the same thing: Counters are
kept locally by each task, are sent on to the rest of the cluster only
infrequently, and are only truly finalized at the end of a job. A Reducer
therefore sees its own copy of a counter rather than the Mappers' totals, so
Counters cannot be used to pass status values from Mapper to Reducer.
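
To make the behaviour from my original question concrete, here is a toy model
in plain Java. It is only a sketch under my own assumptions, not Hadoop code:
the class names (CounterVisibilityDemo, TaskCounters, JobCounters) and the
"FOO.bar" key are made up for illustration. It assumes each task holds a
private counter map and that job-level totals are built by summing those maps
once the tasks have reported.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT Hadoop APIs.
class CounterVisibilityDemo {

    /** Counters as seen by a single task: a private, task-local map. */
    static class TaskCounters {
        private final Map<String, Long> values = new HashMap<>();

        void increment(String name, long amount) {
            values.merge(name, amount, Long::sum);
        }

        long get(String name) {
            // A counter this task never touched reads as zero.
            return values.getOrDefault(name, 0L);
        }
    }

    /** Job-level totals, built by summing what each task reports. */
    static class JobCounters {
        private final Map<String, Long> totals = new HashMap<>();

        void aggregate(TaskCounters task) {
            task.values.forEach((name, v) -> totals.merge(name, v, Long::sum));
        }

        long get(String name) {
            return totals.getOrDefault(name, 0L);
        }
    }

    public static void main(String[] args) {
        TaskCounters mapTask = new TaskCounters();
        TaskCounters reduceTask = new TaskCounters();

        // The "map task" increments its own copy 30 times.
        for (int i = 0; i < 30; i++) {
            mapTask.increment("FOO.bar", 1);
        }

        // The "reduce task" reads its own copy, which was never incremented.
        long seenByReducer = reduceTask.get("FOO.bar");

        // Only the aggregated job-level view contains the map-side increments.
        JobCounters job = new JobCounters();
        job.aggregate(mapTask);
        job.aggregate(reduceTask);

        System.out.println("reducer sees: " + seenByReducer);       // 0
        System.out.println("job total:    " + job.get("FOO.bar"));  // 30
    }
}
```

In a real job, as far as I understand it, the place to read the finalized
totals is the driver, after the job has completed (for example via
job.getCounters() once job.waitForCompletion(true) has returned), not from
inside a Reducer.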

Chapter 6 of Tom White's "Hadoop: The Definitive Guide" does a good job of
describing the workflow:

http://books.google.com/books?id=bKPEwR-Pt6EC&lpg=PP1&dq=hadoop&pg=PA158#v=onepage&q=%22Counters%20are%20sent%20less%20frequently%22&f=false

I definitely recommend this book.

-GS

On Mon, Apr 19, 2010 at 11:07 PM, George Stathis <gstathis@gmail.com> wrote:

> Hello folks,
>
> Possible newbie question here: from
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Counter.html :
> "Counters represent global counters, defined either by the Map-Reduce
> framework or applications". I take this to mean that counters incremented
> during a Mapper process should be available to a Reducer phase. Is this
> correct? If so, I have a counter that I set during my map stage:
>
> [...]
> context.getCounter("FOO","bar").increment(1);
> [...]
>
> and then attempt to retrieve in the reduce phase:
>
> [...]
> long barValue = context.getCounter("FOO","bar").getValue();
> [...]
>
> Inside the reducer, the counter appears to be still at zero. But in the
> final Job report output, it appears incremented as expected:
>
> [...]
> 10/04/19 19:42:20 INFO mapred.JobClient: Counters: 17
> 10/04/19 19:42:20 INFO mapred.JobClient:   Job Counters
> 10/04/19 19:42:20 INFO mapred.JobClient:     Launched reduce tasks=1
> 10/04/19 19:42:20 INFO mapred.JobClient:     Launched map tasks=1
> 10/04/19 19:42:20 INFO mapred.JobClient:     Data-local map tasks=1
> 10/04/19 19:42:20 INFO mapred.JobClient:   FileSystemCounters
> 10/04/19 19:42:20 INFO mapred.JobClient:     FILE_BYTES_READ=904
> 10/04/19 19:42:20 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1836
> 10/04/19 19:42:20 INFO mapred.JobClient:   FOO
> 10/04/19 19:42:20 INFO mapred.JobClient:     bar =30
> [...]
>
> Am I just not understanding how Counters really work? Any help is
> appreciated. Thank you in advance for your feedback.
>
> -GS
>
