hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: splitting of big files?
Date Tue, 27 May 2008 17:57:02 GMT

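The input format you choose determines how the input file is interpreted. A file is treated as text, with newlines as record boundaries, only if the job uses a text input format such as TextInputFormat.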
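
To illustrate the point, here is a minimal sketch in plain Python, not Hadoop's actual API. The function names `line_records` and `fixed_records` are hypothetical, and the line reader assumes that only CR, LF, and CRLF end a record. It reads the same bytes under two different "formats" and gets two different sets of records:

```python
import re

def line_records(data: bytes) -> list:
    """Text-style format: a record is a line ending in CR, LF, or CRLF.

    Assumption for this sketch: NEL and the Unicode line/paragraph
    separators are NOT treated as terminators.
    """
    # CRLF is listed first so it counts as one terminator, not two.
    records = re.split(rb"\r\n|\r|\n", data)
    # A trailing terminator leaves an empty final piece; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records

def fixed_records(data: bytes, size: int) -> list:
    """Binary-style format: a record is `size` bytes; newlines are just data."""
    return [data[i:i + size] for i in range(0, len(data), size)]

data = b"ab\r\ncd\nef\rgh"
print(line_records(data))      # four two-byte records
print(fixed_records(data, 4))  # three four-byte records, newlines included
```

The bytes themselves carry no marker saying "this is text"; only the reader chosen for the job gives the newline bytes any special meaning.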

On 5/27/08 9:46 AM, "koara@atlas.cz" <koara@atlas.cz> wrote:

> How does the application know that the file is 'text', though (i.e. when is a
> newline a special character)? Or are all files assumed to be text?
> And even when they are, how do the different newline representations come into
> play (CR/LF/CRLF/NEL/Unicode posse)?
> Just curious; the main point was already answered by Doug and Andreas, many
> thanks for that.
