hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: question about file split
Date Thu, 16 Aug 2012 18:29:25 GMT
Weishung,

For text files, this is done by the LineRecordReader.

See http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java?view=markup.
Specifically, see L126-L131 and the loop from around L164 onwards.
These parts of the logic correlate with the logic described at
http://wiki.apache.org/hadoop/HadoopMapReduce.
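
To illustrate the idea, here is a simplified in-memory sketch of that boundary logic. It is not the actual Hadoop source: the class SplitLineSketch and the methods readSplit and skipLine are hypothetical names, and it works on a byte array instead of an HDFS stream. It assumes '\n' line endings only. The two rules it models are that every split except the first throws away its first (possibly partial) line, and every split keeps reading until it has consumed a line that starts at or before its end offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of split-boundary handling for text lines.
class SplitLineSketch {

    // Returns the lines that the split [start, start + length) is responsible for.
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;

        // Not the first split: discard everything up to and including the
        // first newline. The reader of the previous split reads that line.
        if (start != 0) {
            pos = skipLine(data, pos);
        }

        List<String> lines = new ArrayList<>();
        // "<=" rather than "<": a line starting exactly at 'end' is read here,
        // because the next split will discard its own first line.
        while (pos <= end && pos < data.length) {
            int next = skipLine(data, pos);
            // Exclude the trailing newline, if there is one, from the record.
            int lineEnd = (data[next - 1] == '\n') ? next - 1 : next;
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return lines;
    }

    // Returns the index just past the next '\n' at or after pos,
    // or data.length if no newline remains.
    static int skipLine(byte[] data, int pos) {
        while (pos < data.length && data[pos] != '\n') {
            pos++;
        }
        return Math.min(pos + 1, data.length);
    }
}
```

With the 12-byte input "aaa\nbbbb\ncc\n" cut into splits of 5 bytes, readSplit(data, 0, 5) returns "aaa" and "bbbb" (the second line crosses the boundary but is read in full), readSplit(data, 5, 5) skips the tail of "bbbb" and returns "cc", and readSplit(data, 10, 2) returns nothing. Every line is read exactly once.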

On Thu, Aug 16, 2012 at 11:48 PM, Weishung Chung <weishung@gmail.com> wrote:
> Hey fellow developers,
>
> I am trying to figure out which class in the code base handles a record
> that runs across a block boundary when reading a file split. I have been
> digging through LineRecordReader, FileInputFormat, TextInputFormat, etc.
>
> Thank you,
> Wei Shung



-- 
Harsh J
