hadoop-common-user mailing list archives

From jamal sasha <jamalsha...@gmail.com>
Subject Difference between combiner and aggregator
Date Fri, 05 Apr 2013 20:30:17 GMT
Hi,
I am trying to understand the difference between a combiner and an aggregator.

Based on my reading, here is the word count example (mapper side) in both styles.

aggregator:
class Mapper
  method MAP
    H <-- new associative array
    for all term t in document do
        H{t} = H{t} + 1
    for all term t ∈ H do
        EMIT(term t, count H{t})


combiner:
class Mapper
  method INITIALIZE
    H <-- new associative array
  method MAP
    for all term t in document do
        H{t} = H{t} + 1
  method CLOSE
    for all term t ∈ H do
        EMIT(term t, count H{t})

So, as I understand it, the second method is how a combiner is implemented.
But the first one seems much simpler.
What do I gain by using a combiner instead of local aggregation?
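
To make the comparison concrete, here is a minimal Python sketch of the two mapper patterns above. It is plain Python, not the Hadoop API: the names map_per_document and StatefulMapper and the three sample documents are made up for illustration, and the lists of pairs stand in for whatever EMIT would send to the shuffle.

```python
from collections import Counter


def map_per_document(document):
    """First pattern: H is created inside MAP, so it covers one document only."""
    counts = Counter(document.split())
    return list(counts.items())  # emitted at the end of every MAP call


class StatefulMapper:
    """Second pattern: H is created once and survives across MAP calls."""

    def __init__(self):  # plays the role of INITIALIZE
        self.counts = Counter()

    def map(self, document):  # MAP only accumulates, it emits nothing
        self.counts.update(document.split())

    def close(self):  # CLOSE emits once, after the last document
        return list(self.counts.items())


# Hypothetical input: three tiny documents handled by the same mapper.
documents = ["a b a", "b c", "a c c"]

per_doc_pairs = [pair for doc in documents for pair in map_per_document(doc)]

mapper = StatefulMapper()
for doc in documents:
    mapper.map(doc)
cross_doc_pairs = mapper.close()

print(len(per_doc_pairs), "pairs emitted by the first pattern")
print(len(cross_doc_pairs), "pairs emitted by the second pattern")
```

On this input the first pattern emits one pair per distinct term per document, while the second emits one pair per distinct term for the whole run of documents, so less data leaves the mapper. The cost is that the second pattern keeps H in memory until CLOSE, so it has to fit for everything the mapper sees.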
