hadoop-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: CombineFileInputFormat and mapreduce in v20.2
Date Thu, 27 Sep 2012 21:39:06 GMT
Hi Anna,

Just to second Bejoy's comments, that's an approach that I used
successfully on a project a year or two ago.  Plan on a day or two to get
the port fully working and tested on your cluster.  Once you start porting
CombineFileInputFormat, you'll probably find that you also need to port
the additional classes that it depends on.  (I'm sorry that I don't
have access to my port of the code anymore, so I can't just hand it over.)

Also, make sure that whatever version you port from includes the fix for
the infinite loop bug.  Here are two older JIRA issues that tracked patches
for it:

https://issues.apache.org/jira/browse/MAPREDUCE-2185

https://issues.apache.org/jira/browse/MAPREDUCE-2862

Thank you,
--Chris

On Thu, Sep 27, 2012 at 1:53 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:

> Hi Anna
>
> One option I can think of is to take CombineFileInputFormat from the
> latest release, add it to your application code as a custom input format,
> and ship it with your MapReduce application jar. This is similar to how
> you would implement an input format of your own and use it with MapReduce.
>
> Regards
> Bejoy KS
>
