hadoop-mapreduce-user mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: Using a custom FileSplitter?
Date Wed, 23 Jun 2010 18:00:27 GMT
Hi Steve,

Please check FileInputFormat.setInputPathFilter() to choose which file
patterns you want to select for your job.
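A minimal sketch of such a filter, assuming the newer org.apache.hadoop.mapreduce API. The class name FooPathFilter and the /data/mixed path are illustrative, not from the original mail:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Accepts only paths whose names end in ".foo".
public class FooPathFilter implements PathFilter {
    @Override
    public boolean accept(Path path) {
        return path.getName().endsWith(".foo");
    }
}
```

Then register it in the driver, e.g.:

```java
Job job = Job.getInstance(new Configuration(), "foo-files-only");
FileInputFormat.setInputPathFilter(job, FooPathFilter.class);
FileInputFormat.addInputPath(job, new Path("/data/mixed"));
```

Note that depending on your Hadoop version the filter may also be applied to directories during input listing, so if your input layout nests directories you may need accept() to return true for them as well.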

If you want to pass a whole file as a single input to your mapper, you can
create your own InputFormat by subclassing FileInputFormat and overriding
the isSplitable() method to return false. You will also need a RecordReader
that reads the entire file as one record.
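A sketch of this pattern, again assuming the org.apache.hadoop.mapreduce API; the names WholeFileInputFormat and WholeFileRecordReader are illustrative. Your custom decoding logic would replace or follow the raw byte read in nextKeyValue():

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileInputFormat
        extends FileInputFormat<NullWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false; // never split: one mapper per file
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        return new WholeFileRecordReader();
    }

    // Emits exactly one record: the entire file as a single BytesWritable.
    static class WholeFileRecordReader
            extends RecordReader<NullWritable, BytesWritable> {
        private FileSplit split;
        private Configuration conf;
        private final BytesWritable value = new BytesWritable();
        private boolean processed = false;

        @Override
        public void initialize(InputSplit split, TaskAttemptContext context) {
            this.split = (FileSplit) split;
            this.conf = context.getConfiguration();
        }

        @Override
        public boolean nextKeyValue() throws IOException {
            if (processed) {
                return false;
            }
            byte[] contents = new byte[(int) split.getLength()];
            Path file = split.getPath();
            FileSystem fs = file.getFileSystem(conf);
            FSDataInputStream in = null;
            try {
                in = fs.open(file);
                IOUtils.readFully(in, contents, 0, contents.length);
                value.set(contents, 0, contents.length);
            } finally {
                IOUtils.closeStream(in);
            }
            processed = true;
            return true;
        }

        @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
        @Override public BytesWritable getCurrentValue() { return value; }
        @Override public float getProgress() { return processed ? 1.0f : 0.0f; }
        @Override public void close() { }
    }
}
```

If your decoder turns each file into lines of text, you could instead emit a Text value per decoded line from nextKeyValue() rather than one BytesWritable for the whole file.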

Thanks and Regards,

On Wed, Jun 23, 2010 at 10:51 PM, Steve Lewis <lordjoe2000@gmail.com> wrote:
> Assume I have one of the two situations (I have both):
> 1) I have a directory with several hundred files; some fraction of these
> need to be passed to the mapper (say the ones ending in ".foo") and the
> others can be ignored. Assume I am incapable or unwilling to create a
> directory containing only the files I need -- how do I set up a custom
> file splitter in Java code to filter my files?
> 2) Assume I have a collection of files which are not splittable, so I will
> use one file per mapper. Assume that special code is required to read each
> file and convert it into lines of text, and that I have Java code to do
> that. Same question -- how do I install a custom file splitter to decode
> files in a custom manner?
> --
> Steven M. Lewis PhD
> Institute for Systems Biology
> Seattle WA
