chukwa-user mailing list archives

From Guillermo Pérez <>
Subject Re: Launching different record & reducers from mapper
Date Tue, 09 Mar 2010 06:56:52 GMT
On Mon, Mar 8, 2010 at 20:15, Eric Yang <> wrote:
> It doesn't look like you are splitting the record in the mapper phase for the
> reducer type ActionLogAggregateWeights.  The current demux is partitioned by
> the reducer record type.  Hence, if the record is split in the reducer phase,
> it will not work.  Take a look at the Top mapper class.  It calls
> buildGenericRecord to partition by reducer type.  The ActionLog mapper should
> mirror the data and send it to both the ActionLog and
> ActionLogAggregateWeights reducer classes.  Hope this helps.

I think I'm already doing that. In the mapper I prepare two records with
two keys, and I set a different reduce type on each key with
key.setReduceType(). One uses the default identity reducer, and the
other uses a special reducer class that combines records to generate
aggregates.
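
To make the mapper side concrete, here is a minimal sketch of the dual emit.
SketchKey, SketchPair and DualEmitMapper are hypothetical stand-ins written
for illustration, not the Chukwa classes, and the "timestamp user action"
input layout is an assumption. Only the two reduce type names come from
this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a record key; NOT the Chukwa key class.
class SketchKey {
    final String reduceType; // names the reducer that should receive this pair
    final String key;        // grouping key within that reducer

    SketchKey(String reduceType, String key) {
        this.reduceType = reduceType;
        this.key = key;
    }
}

// Hypothetical key/record pair emitted by the mapper.
class SketchPair {
    final SketchKey key;
    final String body;

    SketchPair(SketchKey key, String body) {
        this.key = key;
        this.body = body;
    }
}

class DualEmitMapper {
    // Assumed input layout: "timestamp user action", whitespace separated.
    // Each input line yields two pairs, each tagged with its own reduce type.
    static List<SketchPair> map(String line) {
        String[] fields = line.trim().split("\\s+");
        String timestamp = fields[0];
        String user = fields[1];
        String action = fields[2];

        List<SketchPair> out = new ArrayList<>();
        // Pair 1: the record passed through unchanged, for plain logging.
        out.add(new SketchPair(new SketchKey("ActionLog", timestamp), line));
        // Pair 2: the aggregation record, with a richer key and a counter of 1.
        out.add(new SketchPair(
                new SketchKey("ActionLogAggregateWeights", user + "/" + action), "1"));
        return out;
    }
}
```

The point of the sketch is that the split happens in the mapper, so each pair
already carries its reduce type before partitioning.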

> Note, reducer partitioning by RecordType is not correctly implemented in the
> current demux.  Chukwa requires a single reducer per data type to run
> correctly.  If a single record type generates a large amount of data, the
> reducer for that record type becomes the bottleneck of demux.  Hence,
> demux is going to change when the Avro Input/Output format is ready.  I am
> not sure whether it will impact your implementation, but it is something to
> keep in mind.

I'm just generating two records out of each record I map. One is only
for logging, and the other is only for aggregation: it has more fields
in the key and just a counter in the record itself.
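
As a rough illustration of the aggregation side, the sketch below sums the
counters that share a key. CounterAggregator, the user/action key fields and
the "/" separator are all hypothetical names chosen for the example. This is
not my actual reducer class.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CounterAggregator {
    // Hypothetical composite key: the fields to aggregate on, joined with '/'.
    static String aggregateKey(String user, String action) {
        return user + "/" + action;
    }

    // Sum the counters of all pairs that share a key, which is what an
    // aggregating reducer does once the pairs have been grouped by key.
    static Map<String, Long> reduce(List<? extends Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }
}
```

Because the record body is only a counter, the reducer needs nothing but the
key to decide which records to combine.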

Guille -ℬḭṩḩø- <>
