avro-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Hi,all. How can I involve two avro files with different schema into one M/R job?
Date Fri, 18 Mar 2011 18:31:45 GMT
On Fri, Mar 18, 2011 at 11:38 PM, Doug Cutting <cutting@apache.org> wrote:
> Is that what you're after?  Why would you need this?

Probably a niche case, in which I would need to read from multiple
sources in my job (perhaps even process them differently until the Map
phase), with a separate reader schema for each of my sources.

This could be custom-built easily, but I just wondered if general
use-cases of avro datafiles could benefit from such a thing.

Right now AvroJob.setInputSchema(...) stores the given schema under
"avro.input.schema" in the job configuration. My suggestion was to key
it per input path instead, something like /path/1+avro.input.schema and
/path/2+avro.input.schema, so that each record reader instantiated for
the mappers (via MultipleInputs) can pick up its own reader schema,
since it learns its path (e.g. /path/2) from its FileSplit.
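
To make the idea concrete, here is a minimal sketch of the per-path key
lookup. It is not existing Avro API: the class and method names
(PerPathSchemaLookup, keyFor, setInputSchema, readerSchemaFor) are
hypothetical, a plain Map stands in for the job configuration, and the
fallback to the global "avro.input.schema" key is my own assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a Map<String, String> stands in for the job configuration.
final class PerPathSchemaLookup {
    // The key AvroJob.setInputSchema(...) uses today.
    static final String INPUT_SCHEMA_KEY = "avro.input.schema";

    // Proposed key layout: "<input path>+avro.input.schema".
    static String keyFor(String inputPath) {
        return inputPath + "+" + INPUT_SCHEMA_KEY;
    }

    // Register a reader schema (as JSON text) for one input path.
    static void setInputSchema(Map<String, String> conf, String inputPath, String schemaJson) {
        conf.put(keyFor(inputPath), schemaJson);
    }

    // Resolve the reader schema for the file a split points at: try the file's
    // own path, then each parent directory, then the global key (assumed fallback).
    static String readerSchemaFor(Map<String, String> conf, String splitPath) {
        String path = splitPath;
        while (path != null && !path.isEmpty()) {
            String schema = conf.get(keyFor(path));
            if (schema != null) {
                return schema;
            }
            int slash = path.lastIndexOf('/');
            if (slash <= 0) {
                break; // reached the root, nothing more to try
            }
            path = path.substring(0, slash);
        }
        return conf.get(INPUT_SCHEMA_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setInputSchema(conf, "/path/1", "\"string\"");
        setInputSchema(conf, "/path/2", "\"long\"");
        System.out.println(readerSchemaFor(conf, "/path/1/part-00000.avro"));
        System.out.println(readerSchemaFor(conf, "/path/2/part-00000.avro"));
    }
}
```

In a real job the record reader would take the path from its FileSplit
and parse the resolved JSON into a Schema; both steps are left out here
to keep the sketch free of Hadoop and Avro dependencies.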

-- 
Harsh J
http://harshj.com
