avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Multiple Input for Avro jobs
Date Wed, 08 Feb 2012 21:26:06 GMT
If you only need to read multiple input paths, path globs work.
For example, to read both /logs/2012_01 and /logs/2012_02, use the glob path:
/logs/2012_0{1,2}

And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01, and
/logs/2012_02, use:
/logs/201{1,2}_0{1,2}

'*' is a wildcard as well, e.g. /logs/2011_*/
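
To make the brace syntax concrete, here is a rough sketch of which literal
paths the patterns above stand for. It is a hand-rolled illustration, not
Hadoop's own glob code: the class GlobSketch and the method expandBraces are
made-up names, and the sketch handles only flat {a,b} groups (no nesting, no
'*').

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; Hadoop resolves globs itself.
class GlobSketch {
    // Expands each {a,b,...} group, left to right, into literal paths.
    // Nested braces and the '*' wildcard are deliberately not handled.
    static List<String> expandBraces(String pattern) {
        int open = pattern.indexOf('{');
        if (open < 0) {
            // No group left: the pattern is already a literal path.
            return Collections.singletonList(pattern);
        }
        int close = pattern.indexOf('}', open);
        if (close < 0) {
            throw new IllegalArgumentException("Unclosed '{' in: " + pattern);
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        List<String> result = new ArrayList<>();
        for (String alternative : pattern.substring(open + 1, close).split(",", -1)) {
            // Substitute one alternative, then expand any remaining groups.
            result.addAll(expandBraces(prefix + alternative + suffix));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [/logs/2012_01, /logs/2012_02]
        System.out.println(expandBraces("/logs/2012_0{1,2}"));
        // prints [/logs/2011_01, /logs/2011_02, /logs/2012_01, /logs/2012_02]
        System.out.println(expandBraces("/logs/201{1,2}_0{1,2}"));
    }
}
```

In a real job you would not expand anything yourself. You would pass the glob
as the input path, for instance (assuming the old mapred API)
FileInputFormat.setInputPaths(conf, new Path("/logs/201{1,2}_0{1,2}")), and
let Hadoop resolve it against the file system.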


If you need a separate mapper per directory, or a different split assignment,
more work would be involved.

On 2/8/12 12:24 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:

> Hi all,
> 
> I am trying to assign different mappers to different folders.
> 
> Is there an equivalent of MultipleInputs for Avro?
> 
> 
>   MultipleInputs.addInputPath(job, new Path(input),
> AvroInputFormat<GenericRecord>.class, MapImpl.class);
> 
> 
> Thanks
> Serge
>      


