hadoop-mapreduce-user mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: Using a custom FileSplitter?
Date Wed, 23 Jun 2010 18:00:27 GMT
Hi Steve,

Please check FileInputFormat.setInputPathFilter() to choose which file
patterns you want to select for your job.
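For example, a minimal sketch against the new-API (org.apache.hadoop.mapreduce)
classes -- the FooFilter class name and the "/input/dir" path are made up for
illustration:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Hypothetical filter: accept only files whose names end in ".foo".
public class FooFilter implements PathFilter {
    @Override
    public boolean accept(Path path) {
        return path.getName().endsWith(".foo");
    }
}

// In the job driver, something like:
//   Job job = new Job(conf);
//   FileInputFormat.addInputPath(job, new Path("/input/dir"));
//   FileInputFormat.setInputPathFilter(job, FooFilter.class);
```

FileInputFormat applies your filter on top of its default filter (which
already skips hidden files such as _logs), so only the ".foo" files reach
your mappers.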

If you want to pass a whole file as input to your mapper, you can
create your own InputFormat by subclassing FileInputFormat and
overriding the isSplitable() method.
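A rough sketch of that approach, along the lines of the well-known
whole-file pattern (class names here are invented, and the reader simply
loads the raw bytes -- your custom decoding code would go in its place):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileInputFormat
        extends FileInputFormat<NullWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;  // never split: each file goes to one mapper
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        return new WholeFileRecordReader();
    }

    // Emits exactly one record: the entire file contents as the value.
    public static class WholeFileRecordReader
            extends RecordReader<NullWritable, BytesWritable> {
        private FileSplit split;
        private Configuration conf;
        private final BytesWritable value = new BytesWritable();
        private boolean processed = false;

        @Override
        public void initialize(InputSplit split, TaskAttemptContext ctx) {
            this.split = (FileSplit) split;
            this.conf = ctx.getConfiguration();
        }

        @Override
        public boolean nextKeyValue() throws IOException {
            if (processed) return false;
            byte[] contents = new byte[(int) split.getLength()];
            Path file = split.getPath();
            FileSystem fs = file.getFileSystem(conf);
            FSDataInputStream in = null;
            try {
                in = fs.open(file);
                IOUtils.readFully(in, contents, 0, contents.length);
                value.set(contents, 0, contents.length);
            } finally {
                IOUtils.closeStream(in);
            }
            processed = true;
            return true;
        }

        @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
        @Override public BytesWritable getCurrentValue() { return value; }
        @Override public float getProgress() { return processed ? 1.0f : 0.0f; }
        @Override public void close() { }
    }
}
```

Your mapper then receives the whole file as one BytesWritable and can
apply whatever custom decoding turns it into lines of text.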

Thanks and Regards,
Sonal
www.meghsoft.com
http://in.linkedin.com/in/sonalgoyal



On Wed, Jun 23, 2010 at 10:51 PM, Steve Lewis <lordjoe2000@gmail.com> wrote:
> Assume I have one of the two situations (I have both):
> 1) I have a directory with several hundred files - of these, some fraction
> need to be passed to the mapper (say the ones ending in ".foo") and the
> others can be ignored. Assume I am incapable or unwilling to create a
> directory containing only the files that I need - how do I set up a custom
> file splitter using Java code to filter my files?
> 2) Assume I have a collection of files which are not splittable, so I will
> use one file per mapper. Assume that special code is required to read the
> file and convert it into lines of text, and that I have Java code to do
> that. Same question - how do I install a custom file splitter to decode
> files in a custom manner?
>
> --
> Steven M. Lewis PhD
> Institute for Systems Biology
> Seattle WA
>
