hadoop-common-user mailing list archives

From Kevin <klz...@gmail.com>
Subject Re: Are lines broken in dfs and/or in InputSplit
Date Wed, 16 Jul 2008 23:20:21 GMT
I tried a bit, and it looks like lines are preserved so far. However,
is this behavior guaranteed, or is there something I should do to make
sure it keeps working this way? Thank you.


On Tue, Jul 15, 2008 at 5:07 PM, Kevin <klzhao@gmail.com> wrote:
> Hi,
> I am trying to parse text input with line-based information in a mapper,
> and this has become an issue. I wonder whether lines are preserved or
> broken when a file is cut into blocks by dfs. Also, it looks like,
> although TextInputFormat breaks a file into line records, the
> InputSplit passed to the InputFormat may not preserve lines. If this is
> the case, is it possible to restore the lines for mapper input, or do I
> have to drop the broken lines? Thank you.
> Best,
> -Kevin
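
For illustration only, here is a minimal sketch of one convention a line-based record reader can use so that no line is lost or duplicated when a file is cut at arbitrary byte offsets. It is a standalone simulation, not Hadoop code: the class SplitLineSketch and the methods readSplit and readAllSplits are hypothetical names, and it assumes "\n" line endings and UTF-8 text. The assumed rule is that a split which does not start at byte 0 discards everything up to and including its first newline, and every split keeps reading past its own end to finish the last line it started.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitLineSketch {

    /** Returns the complete lines owned by the byte range [start, start + length). */
    static List<String> readSplit(byte[] file, int start, int length) {
        int end = Math.min(start + length, file.length);
        int pos = start;

        // Assumed convention: a split that does not begin the file skips its
        // first (possibly partial) line, because the previous split reads it.
        if (start != 0) {
            while (pos < file.length && file[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline itself
        }

        List<String> lines = new ArrayList<>();
        // A line is owned by this split if it starts at or before the split end;
        // the read may run past 'end' to reach the terminating newline.
        while (pos < file.length && pos <= end) {
            int lineStart = pos;
            while (pos < file.length && file[pos] != '\n') {
                pos++;
            }
            lines.add(new String(file, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the newline
        }
        return lines;
    }

    /** Cuts the file into fixed-size splits and concatenates what each split reads. */
    static List<String> readAllSplits(byte[] file, int splitSize) {
        List<String> all = new ArrayList<>();
        for (int start = 0; start < file.length; start += splitSize) {
            all.addAll(readSplit(file, start, splitSize));
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] file = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        // Split boundaries at every 4 bytes fall in the middle of lines.
        for (int start = 0; start < file.length; start += 4) {
            System.out.println("split@" + start + " -> " + readSplit(file, start, 4));
        }
    }
}
```

Under this rule, a line that straddles a boundary is read whole by the split in which it starts, so a mapper would see only complete lines. Whether a given Hadoop release behaves exactly this way should be checked against its own record reader source.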
