avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Avro Map-Reduce and ChainMapper
Date Wed, 08 Feb 2012 18:04:44 GMT
I have not tried or tested ChainMapper with Avro myself.  It will probably
work if you configure the input and output schemas appropriately.
Take a look at what AvroJob.setInputSchema is doing; if you are familiar
enough with Hadoop's configuration, you may be able to work it out.  Others
likely know more than I do on this.
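
To illustrate the idea only: as far as I understand it, AvroJob.setInputSchema
essentially stores the schema's JSON text in the job configuration under a
string key, so each chained mapper's own conf would need the matching entries.
The sketch below is untested against Hadoop or Avro. It uses a plain Map as a
stand-in for JobConf, the key names are what I believe the AvroJob constants
resolve to (treat them as assumptions), and the class and method names are
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaConfSketch {
    // Assumed key names; check the AvroJob constants in your Avro version.
    static final String INPUT_SCHEMA = "avro.input.schema";
    static final String MAP_OUTPUT_SCHEMA = "avro.map.output.schema";

    // Stand-in for AvroJob.setInputSchema: store the schema JSON under a key.
    static void setInputSchema(Map<String, String> conf, String schemaJson) {
        conf.put(INPUT_SCHEMA, schemaJson);
    }

    // Stand-in for AvroJob.setMapOutputSchema.
    static void setMapOutputSchema(Map<String, String> conf, String schemaJson) {
        conf.put(MAP_OUTPUT_SCHEMA, schemaJson);
    }

    public static void main(String[] args) {
        // Hypothetical record schema shared by both mappers in the chain.
        String record = "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":[]}";

        // First mapper: non-Avro input, so only its output schema is set.
        Map<String, String> firstMapperConf = new HashMap<>();
        setMapOutputSchema(firstMapperConf, record);

        // Second mapper: reads what the first one wrote, so its input schema
        // must match the first mapper's output schema.
        Map<String, String> secondMapperConf = new HashMap<>();
        setInputSchema(secondMapperConf, firstMapperConf.get(MAP_OUTPUT_SCHEMA));

        System.out.println(secondMapperConf);
    }
}
```

With the real API, the per-mapper conf objects would be the ones passed to
ChainMapper.addMapper; whether Avro's mapper wrappers pick the schemas up from
there is the part that needs testing.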

Also, you may be interested in how things are done in this variation:
https://github.com/wibidata/odiago-avro


On 2/1/12 8:23 AM, "Andrew Kenworthy" <adwkenworthy@yahoo.com> wrote:

> Hallo,
> 
> Is it possible to chain Avro MR jobs using the ChainMapper? I'm looking to
> chain two map tasks and a reducer, but haven't been able to find any examples:
> 
> Chain summary:
> a) first map task: takes non-avro input and produces K/V output in the form of
> AvroKey(Record), NullWritable
> b) second map task: taking output of first task as its input [mapper extends
> AvroMapper(Record, Pair(Record, NullWritable))]
> c) reducer: AvroReducer
> 
> In particular, how would I specify the input and output schemas - simply
> calling AvroJob.setInputSchema/setOutputSchema on the individual chained job
> conf objects?
> 
> Thanks,
> 
> Andrew


