hadoop-common-user mailing list archives

From Jason Venner <ja...@attributor.com>
Subject Re: one input file per map
Date Thu, 03 Jul 2008 13:19:39 GMT
You could also set your minimum input split size to Long.MAX_VALUE, so that no file is ever divided into more than one split.
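
To see why this works, here is a minimal sketch of the split arithmetic. It assumes the rule used by the old FileInputFormat API, splitSize = max(minSize, min(goalSize, blockSize)), and it leaves out details of the real implementation such as the slop factor on the last split. The class and method names (SplitSketch, computeSplitSize, splitLengths) are made up for illustration and are not Hadoop classes. In the releases I have used, the minimum is read from the mapred.min.split.size job property, but check that name against your version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of how a file is cut into splits.
// It does not use or reproduce any Hadoop class.
public class SplitSketch {

    // Assumed rule: the minimum size wins over both the goal size and the block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Returns the length of each split that would be created for one file.
    static List<Long> splitLengths(long fileLength, boolean splitable,
                                   long goalSize, long minSize, long blockSize) {
        List<Long> lengths = new ArrayList<>();
        if (!splitable) {
            // Models isSplitable() returning false: the whole file is one split.
            lengths.add(fileLength);
            return lengths;
        }
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        long remaining = fileLength;
        while (remaining > splitSize) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        lengths.add(remaining); // last (or only) split
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLength = 200 * mb;
        long blockSize = 64 * mb;
        long goalSize = 100 * mb;

        // Default minimum of 1 byte: the file is cut at block-sized boundaries.
        System.out.println(splitLengths(fileLength, true, goalSize, 1L, blockSize).size());

        // Minimum of Long.MAX_VALUE: the split size exceeds any file length.
        System.out.println(splitLengths(fileLength, true, goalSize, Long.MAX_VALUE, blockSize).size());

        // Non-splitable input format: one split regardless of sizes.
        System.out.println(splitLengths(fileLength, false, goalSize, 1L, blockSize).size());
    }
}
```

With a 200 MB file and 64 MB blocks, the default settings give four splits, while either a minimum of Long.MAX_VALUE or a non-splitable format gives one. The two approaches end up in the same place; the split-size setting just avoids writing a subclass.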

Goel, Ankur wrote:
> Nope, but if that is the intent, there are two ways of doing it.
>
> 1. Extend the input format of your choice and override its
> isSplitable() method to return false.
>
> 2. Compress your text file using a compression format supported by
> Hadoop (e.g. gzip). This ensures that one map task processes one file,
> since gzip-compressed files are not split across map tasks.
>
>
> -----Original Message-----
> From: Qiong Zhang [mailto:jamesz@yahoo-inc.com] 
> Sent: Tuesday, July 01, 2008 9:54 PM
> To: core-user@hadoop.apache.org
> Subject: one input file per map 
>
> Hi,
>
>  
>
> Is there an existing input format/split which supports one input file
> (e.g. plain text) per map task?
>
>  
>
> Thanks,
>
> James
>
>   
