hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: isSplitable() deprecated
Date Fri, 08 Jan 2010 21:13:26 GMT
FYI, the input file is in .gz format.
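
A minimal sketch of why the .gz detail matters for the splitting plan quoted below. It uses only java.util.zip, not Hadoop, and the class and method names (GzipOffsetDemo, gzip, canDecodeFrom) are made up for illustration. A gzip stream has one header at the start, so a reader that begins at some other byte offset generally cannot decode it. As far as I know, that is why TextInputFormat in 0.20 reports a file matched by a compression codec as not splittable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipOffsetDemo {

    /** Compresses raw bytes into a single in-memory gzip stream. */
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Returns true if the bytes from 'offset' onward decode as a complete gzip stream. */
    public static boolean canDecodeFrom(byte[] gz, int offset) {
        try (InputStream in = new GZIPInputStream(
                new ByteArrayInputStream(gz, offset, gz.length - offset))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // drain the stream; only success or failure matters here
            }
            return true;
        } catch (IOException e) {
            // a missing header or corrupt data surfaces as an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] gz = gzip("record one\n\nrecord two\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("from offset 0:   " + canDecodeFrom(gz, 0));
        System.out.println("from the middle: " + canDecodeFrom(gz, gz.length / 2));
    }
}
```

Under these assumptions, the practical consequence is that the file would need to be decompressed (or stored in a splittable form) before any byte-offset splitting can help.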

On Fri, Jan 8, 2010 at 11:08 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> My current project processes an input file of 333302161 bytes.
> What I plan to do is split the file into equal-size pieces (on blank-line
> boundaries) to improve performance.
>
> I found 12 classes in the 0.20.1 source code that implement InputSplit.
>
> If anyone has written code similar to what I plan to do, please share some
> hints.
>
> Thanks
>
>
> On Fri, Jan 8, 2010 at 2:27 AM, Amogh Vasekar <amogh@yahoo-inc.com> wrote:
>
>> Hi,
>> The deprecation is due to the new, still-evolving mapreduce (o.a.h.mapreduce)
>> APIs. The old APIs remain supported in available distributions. The equivalent
>> of TextInputFormat is available in the new API:
>>
>>
>> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/TextInputFormat.html
>>
>> Thanks,
>> Amogh
>>
>>
>> On 1/8/10 3:47 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>
>> According to:
>>
>> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/TextInputFormat.html#isSplitable%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path%29
>>
>> isSplitable() is deprecated.
>>
>> Which method should I use to replace it?
>>
>> Thanks
>>
>>
>

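The plan quoted above (roughly equal-size pieces that break only on blank lines) can be sketched in plain Java, independent of any Hadoop InputFormat or InputSplit class. This is only an illustration under stated assumptions: the data is already uncompressed and in memory, records are separated by an empty line ("\n\n"), and the names BlankLineSplitter, splitOffsets and nextBlankLine are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BlankLineSplitter {

    /**
     * Returns the start offset of each piece. The first piece starts at 0;
     * every later piece starts on the first byte after a blank line.
     * Fewer than 'pieces' offsets come back if the data has too few blank lines.
     */
    public static List<Integer> splitOffsets(byte[] data, int pieces) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        if (pieces <= 1 || data.length == 0) {
            return starts;
        }
        int target = data.length / pieces; // ideal piece size in bytes
        for (int i = 1; i < pieces; i++) {
            int last = starts.get(starts.size() - 1);
            int from = Math.max(i * target, last);
            // Step back one byte so a boundary straddling 'from' is still seen.
            int boundary = nextBlankLine(data, Math.max(from - 1, 0));
            if (boundary < 0 || boundary >= data.length) {
                break; // no further blank line, or nothing left after it
            }
            starts.add(boundary);
        }
        return starts;
    }

    /** Index of the first byte after the next "\n\n" at or after 'from', or -1 if none. */
    public static int nextBlankLine(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\n' && data[i + 1] == '\n') {
                return i + 2;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "a\nb\n\nc\nd\n\ne\nf\n\ng\nh\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitOffsets(data, 2)); // prints [0, 10]
    }
}
```

In a real job the same idea would be applied to byte ranges of a file rather than to an in-memory array, and pieces will be only approximately equal because each one is extended to the next blank line.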