hadoop-common-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: HDFS file content restrictions
Date Fri, 04 Mar 2011 20:29:59 GMT
The class responsible for reading records as lines off a file seeks into
the next block in sequence until it reaches the newline, so a record that
straddles a block boundary is still delivered whole to a single map task.
This behavior, and how it affects the map tasks, is better documented here
(see the TextInputFormat example doc):
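As a rough illustration of the behavior described above, here is a minimal
sketch (not Hadoop's actual LineRecordReader, and the class and method names
are hypothetical): each split skips a partial first line, which belongs to the
reader of the previous split, and reads past its own end offset to finish its
last line.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of how a line-oriented record reader assigns whole
// records to splits that cut a file mid-line. Not Hadoop's real code.
public class SplitLineReader {
    // Read the records belonging to the split [start, end) of data.
    static List<String> readSplit(byte[] data, int start, int end) {
        List<String> records = new ArrayList<>();
        int pos = start;
        if (start != 0) {
            // Skip forward to just past the first newline; that partial
            // line is read by whichever task owns the previous split.
            while (pos < data.length && data[pos - 1] != '\n') pos++;
        }
        while (pos < end && pos < data.length) {
            int lineStart = pos;
            // Read to the end of the line, even if that runs past 'end'.
            while (pos < data.length && data[pos] != '\n') pos++;
            records.add(new String(data, lineStart, pos - lineStart));
            pos++; // step over the newline
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes();
        // Pretend the block size is 8 bytes, cutting "bravo" in half.
        System.out.println(readSplit(data, 0, 8));            // [alpha, bravo]
        System.out.println(readSplit(data, 8, 16));           // [charlie]
        System.out.println(readSplit(data, 16, data.length)); // []
    }
}
```

Note how the first split reads "bravo" in full even though its byte range ends
mid-word, while the second split discards the fragment it starts on; no record
is ever split or lost.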

On Sat, Mar 5, 2011 at 1:54 AM, Kelly Burkhart <kelly.burkhart@gmail.com> wrote:
> On Fri, Mar 4, 2011 at 1:42 PM, Harsh J <qwertymaniac@gmail.com> wrote:
>> HDFS does not operate with records in mind.
> So does that mean that HDFS will break a file at exactly <blocksize>
> bytes?  Map/Reduce *does* operate with records in mind, so what
> happens to the split record?  Does HDFS put the fragments back
> together and deliver the reconstructed record to one map?  Or are both
> fragments and consequently the whole record discarded?
> Thanks,
> -Kelly

Harsh J
