hadoop-user mailing list archives

From Anna Lahoud <annalah...@gmail.com>
Subject Re: CombineFileInputFormat and mapreduce in v20.2
Date Fri, 28 Sep 2012 12:17:07 GMT
Thank you Bejoy and Chris! Fabulous idea that I will definitely use. And I
really appreciate the tips to make it go a little smoother, as well.

On Thu, Sep 27, 2012 at 5:39 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:

> Hi Anna,
> Just to second Bejoy's comments, that's an approach that I used
> successfully on a project a year or two ago.  Plan on a day or two to get
> the port fully working and tested on your cluster.  Once you start porting
> in CombineFileInputFormat, you'll probably find that you need to start
> porting in additional classes that it depends on.  (I'm sorry that I don't
> have access to my port of the code anymore, so I can't just hand it over.)
> Also, make sure that whatever version you port from includes the fix for
> the infinite loop bug.  Here are 2 old JIRAs that tracked patches to fix
> the infinite loop:
> https://issues.apache.org/jira/browse/MAPREDUCE-2185
> https://issues.apache.org/jira/browse/MAPREDUCE-2862
> Thank you,
> --Chris
> On Thu, Sep 27, 2012 at 1:53 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:
>> Hi Anna
>> One option I can think of is to take CombineFileInputFormat from the
>> latest release, add it to your application code as a custom input format,
>> and ship it with your MapReduce application jar. This is much like
>> implementing an input format of your own and using it with MapReduce.
>> Regards
>> Bejoy KS
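
For readers new to the class being ported: the rough idea behind CombineFileInputFormat is to pack many small files into fewer, larger splits, bounded by a maximum split size. The sketch below is plain Java with no Hadoop dependency and is not the Hadoop implementation, which also takes node and rack locality into account. The class name CombineSketch, the helper pack, and the sample file sizes are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CombineSketch {
    // Hypothetical helper: greedily packs file lengths (in bytes) into groups
    // whose total stays at or below maxSplitBytes. A single file larger than
    // the limit ends up in a group of its own.
    static List<List<Long>> pack(long[] fileLengths, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long len : fileLengths) {
            // Close the current group if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + len > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(len);
            currentBytes += len;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long[] lengths = {40, 30, 50, 10, 120, 5}; // made-up sizes
        List<List<Long>> splits = pack(lengths, 100);
        System.out.println(splits.size() + " combined splits: " + splits);
    }
}
```

With the made-up sizes above and a limit of 100, the six files collapse into four groups, so a job would launch four map tasks rather than six. On a real cluster the grouping comes from the ported class itself, not from code like this.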
