avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Multiple Input for Avro jobs
Date Wed, 08 Feb 2012 22:53:15 GMT
Unfortunately, I am not familiar with how MultipleInputs works. You may be able
to compose it with AvroInputFormat inside your own InputFormat to get the
result you need, but someone with more Hadoop input experience would know
more.

On 2/8/12 2:45 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:

> Thanks for the reply, Scott!
> 
> I need to assign a different mapper to each directory.
> 
> Something similar to MultipleInputs.addInputPath.
> 
> Any suggestions?
> 
> 
> Thanks
> Serge
> 
> 
> 
> On Wed, Feb 8, 2012 at 1:26 PM, Scott Carey <scottcarey@apache.org> wrote:
>> If you are after only multiple paths, path globs work.
>> For example, to read both /logs/2012_01 and /logs/2012_02, use the glob path:
>> /logs/2012_0{1,2}
>> 
>> And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01, and
>> /logs/2012_02, use:
>> /logs/201{1,2}_0{1,2}
>> 
>> '*' is a wildcard as well, e.g. /logs/2011_*/
>> 
>> 
>> If you need a mapper instance per directory or a different split assignment,
>> there would be more work involved.
>> 
>> On 2/8/12 12:24 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:
>> 
>>> Hi all,
>>> 
>>> I am trying to assign a different mapper to each folder.
>>> 
>>> Is there an equivalent of MultipleInputs for Avro?
>>> 
>>> 
>>>   MultipleInputs.addInputPath(job, new Path(input),
>>>       AvroInputFormat.class, MapImpl.class);
>>> 
>>> 
>>> Thanks
>>> Serge
>>>      
> 
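
For anyone reading this thread later, here is a rough sketch of how the brace
globs mentioned above expand into concrete paths. It is plain Java with no
Hadoop dependency and is only an illustration: Hadoop does the expansion
itself when the glob is passed as an input path, and the class and method
names here (BraceGlobDemo, expand) are made up for this example. Nested
braces, '*' wildcards, and escaping are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop or Avro.
class BraceGlobDemo {

    // Expands each {a,b,...} group from left to right into one path per alternative.
    static List<String> expand(String pattern) {
        int open = pattern.indexOf('{');
        if (open < 0) {
            // No brace group left: the pattern is already a concrete path.
            List<String> single = new ArrayList<>();
            single.add(pattern);
            return single;
        }
        int close = pattern.indexOf('}', open);
        if (close < 0) {
            throw new IllegalArgumentException("unclosed '{' in " + pattern);
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        List<String> out = new ArrayList<>();
        for (String alt : pattern.substring(open + 1, close).split(",", -1)) {
            // Substitute one alternative, then expand any groups remaining in the suffix.
            out.addAll(expand(prefix + alt + suffix));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String path : expand("/logs/201{1,2}_0{1,2}")) {
            System.out.println(path);
        }
    }
}
```

Running main on /logs/201{1,2}_0{1,2} lists the four directories from the
example above, one per line. Note that a glob only selects which paths are
read; every path still goes to the same mapper class, which is why a mapper
per directory needs more work.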


