avro-user mailing list archives

From: Serge Blazhievsky <easyv...@gmail.com>
Subject: Re: Multiple Input for Avro jobs
Date: Wed, 08 Feb 2012 22:45:35 GMT

Thanks for the reply, Scott!

I need to assign a different mapper to each input directory.

Something similar to MultipleInputs.addInputPath.

Any suggestions?


Thanks
Serge
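
One possible workaround, sketched here under assumptions rather than taken from the thread, is to run a single mapper and branch on the directory of the file being read. As far as I know, the old mapred API exposes the current file path through the `map.input.file` job property, but treat that as an assumption to verify for your Hadoop version. The class and method names below (PerDirectoryDispatchSketch, register, handle) are hypothetical, and plain strings stand in for Avro records so the sketch has no Hadoop or Avro dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: picks per-directory logic based on the input file path.
public class PerDirectoryDispatchSketch {
    // Directory prefix (always ending in "/") mapped to the logic for that directory.
    private final Map<String, Function<String, String>> handlers = new LinkedHashMap<>();

    public void register(String directory, Function<String, String> handler) {
        // Normalize so that "/logs/2012_01" does not also match "/logs/2012_012/...".
        String prefix = directory.endsWith("/") ? directory : directory + "/";
        handlers.put(prefix, handler);
    }

    // inputFile is assumed to be the path of the file behind the current split.
    public String handle(String inputFile, String record) {
        for (Map.Entry<String, Function<String, String>> e : handlers.entrySet()) {
            if (inputFile.startsWith(e.getKey())) {
                return e.getValue().apply(record);
            }
        }
        throw new IllegalArgumentException("No handler registered for " + inputFile);
    }

    public static void main(String[] args) {
        PerDirectoryDispatchSketch dispatch = new PerDirectoryDispatchSketch();
        dispatch.register("/logs/2012_01", r -> "january:" + r);
        dispatch.register("/logs/2012_02", r -> "february:" + r);
        System.out.println(dispatch.handle("/logs/2012_02/part-00000.avro", "rec"));
    }
}
```

In a real job the body of handle would live inside the map method, with the lookup done once per task rather than once per record.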



On Wed, Feb 8, 2012 at 1:26 PM, Scott Carey <scottcarey@apache.org> wrote:

> If you are after only multiple paths, path globs work.
> For example, to read both /logs/2012_01 and /logs/2012_02, use the glob
> path:
> /logs/2012_0{1,2}
>
> And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01,
> and /logs/2012_02, use:
> /logs/201{1,2}_0{1,2}
>
> '*' is a wildcard as well, e.g. /logs/2011_*/
>
>
> If you need a mapper instance per directory, or a different split
> assignment, there would be more work involved.
>
> On 2/8/12 12:24 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:
>
> Hi all,
>
> I am trying to assign a different mapper to each input folder.
>
> Is there an equivalent of MultipleInputs for Avro?
>
>
>   MultipleInputs.addInputPath(job, new Path(input),
> AvroInputFormat.class, MapImpl.class);
>
>
> Thanks
> Serge
>
>
>
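
To make the brace globs quoted above concrete, here is a minimal sketch of how an alternation such as /logs/201{1,2}_0{1,2} fans out into individual paths. It is not Hadoop's implementation (Hadoop resolves globs itself against the file system); the class name BraceGlobSketch and the method expand are hypothetical, nested braces are not handled, and '*' is left untouched because matching it needs a real directory listing.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: expands {a,b,...} groups in a path pattern.
public class BraceGlobSketch {
    public static List<String> expand(String pattern) {
        List<String> out = new ArrayList<>();
        int open = pattern.indexOf('{');
        int close = pattern.indexOf('}', open + 1);
        if (open < 0 || close < 0) {
            // No complete brace group left: the pattern is a single path.
            out.add(pattern);
            return out;
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        // Substitute each alternative, then expand any groups that remain.
        for (String alt : pattern.substring(open + 1, close).split(",")) {
            out.addAll(expand(prefix + alt + suffix));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String path : expand("/logs/201{1,2}_0{1,2}")) {
            System.out.println(path);
        }
    }
}
```

Running main lists the four directories named in the quoted message, in the order 2011_01, 2011_02, 2012_01, 2012_02.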
