avro-user mailing list archives

From Serge Blazhievsky <easyv...@gmail.com>
Subject Re: Multiple Input for Avro jobs
Date Wed, 08 Feb 2012 23:01:32 GMT
Yes, I have been trying to look into AvroInputFormat.

Can you point me to some examples of AvroInputFormat usage?


Thanks

Serge

On Wed, Feb 8, 2012 at 2:53 PM, Scott Carey <scottcarey@apache.org> wrote:

> Unfortunately, I am not familiar with how MultiInput works.  You may be
> able to compose it with AvroInputFormat within your own InputFormat to get
> the required results, but someone with more hadoop input experience would
> know more.
>
> On 2/8/12 2:45 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:
>
> Thanks for the reply, Scott!
>
> I need to assign a different mapper to each directory.
>
> Something similar to MultiInput.addPath
>
> Any suggestions?
>
>
> Thanks
> Serge
>
>
>
> On Wed, Feb 8, 2012 at 1:26 PM, Scott Carey <scottcarey@apache.org> wrote:
>
>> If you are after only multiple paths, path globs work.
>> For example to read both /logs/2012_01  and /logs/2012_02 use the glob
>> path:
>> /logs/2012_0{1,2}
>>
>> And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01,
>> and /logs/2012_02, use:
>> /logs/201{1,2}_0{1,2}
>>
>> '*' is a wildcard as well, e.g. /logs/2011_*/
>>
>>
>> If you need a mapper instance per directory or a different split
>> assignment, there would be more work involved.
>>
>> On 2/8/12 12:24 PM, "Serge Blazhievsky" <easyvoip@gmail.com> wrote:
>>
>> Hi all,
>>
>> I am trying to assign a different mapper to each input folder.
>>
>> Is there an equivalent of MultipleInputs for Avro?
>>
>>
>>   MultipleInputs.addInputPath(job, new Path(input),
>> AvroInputFormat.class, MapImpl.class);
>>
>>
>> Thanks
>> Serge
>>
>>
>>
>
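
For anyone who wants to check which directories the glob patterns quoted above would cover before submitting a job, here is a minimal sketch. It assumes that the glob matcher in the Java standard library (java.nio.file) treats `{a,b}` and `*` the same way Hadoop does for these simple cases; it does not call Hadoop itself, where FileSystem.globStatus does the real expansion. The class name GlobCheck and the sample path /logs/2013_01 are made up for illustration, and the trailing slash from the wildcard example is dropped because the matcher compares whole paths.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop or Avro.
class GlobCheck {
    // Returns true when `path` matches `glob` under java.nio glob rules
    // (assumed here to agree with Hadoop for brace and '*' patterns).
    static boolean matches(String glob, String path) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // Patterns taken from the thread (trailing slash removed from the last one).
        String[] globs = {
            "/logs/2012_0{1,2}",
            "/logs/201{1,2}_0{1,2}",
            "/logs/2011_*"
        };
        // Sample directory names; /logs/2013_01 is an invented non-matching case.
        String[] paths = {
            "/logs/2011_01", "/logs/2011_02",
            "/logs/2012_01", "/logs/2012_02",
            "/logs/2013_01"
        };
        for (String glob : globs) {
            for (String path : paths) {
                System.out.println(glob + " vs " + path + ": " + matches(glob, path));
            }
        }
    }
}
```

This only checks pattern syntax against path strings. It says nothing about assigning a separate mapper per directory, which is the part of the question that remains open in the thread.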
