hadoop-common-user mailing list archives

From Owen O'Malley <o...@yahoo-inc.com>
Subject Re: SequenceFile (Text,Text) becomes plain text
Date Fri, 02 Feb 2007 23:58:47 GMT

On Feb 2, 2007, at 2:46 PM, Bryan A. P. Pendleton wrote:

> Note that, unless there are no tab characters in the keys of the
> output from the first job, there's no way to read the existing
> output accurately back in.

*Sigh* That asymmetry in Text{In,Out}putFormat has bothered me for a
while now. I think at some point, we should do a
TabText{In,Out}putFormat that looks like:

<key>\t<value>\n with tabs and newlines escaped in the keys and values.

That will give us a symmetric set of text formats. Furthermore, I'd
say that if value == NULL, the tab should be left off.
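
A minimal sketch of how such an escaped, symmetric line format could work, in plain Java with no Hadoop dependencies. The class name TabTextCodec and its methods are hypothetical, and the choice of backslash as the escape character is an assumption; the proposal above only says that tabs and newlines are escaped. Carriage returns are not handled here.

```java
// Hypothetical helper illustrating the proposed <key>\t<value>\n layout.
// Assumption: backslash is the escape character, so it must be escaped too.
final class TabTextCodec {

    // Replace backslash, tab, and newline with two-character escapes so
    // that none of them appears raw inside a key or value.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Inverse of escape(): a backslash followed by 't' or 'n' becomes a
    // tab or newline; any other escaped character is taken literally.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                switch (next) {
                    case 't': sb.append('\t'); break;
                    case 'n': sb.append('\n'); break;
                    default:  sb.append(next);
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Write one record. A null value leaves the tab off entirely, which
    // keeps it distinguishable from an empty value ("key\t\n").
    static String formatLine(String key, String value) {
        if (value == null) {
            return escape(key) + "\n";
        }
        return escape(key) + "\t" + escape(value) + "\n";
    }

    // Read one record back as {key, value}; value is null if the line
    // has no tab. The first raw tab is always the separator, because
    // every tab inside a key or value was escaped on output.
    static String[] parseLine(String line) {
        if (line.endsWith("\n")) {
            line = line.substring(0, line.length() - 1);
        }
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { unescape(line), null };
        }
        return new String[] {
            unescape(line.substring(0, tab)),
            unescape(line.substring(tab + 1))
        };
    }
}
```

With this scheme, parseLine(formatLine(key, value)) returns the original pair even when the key contains tabs, which is the symmetry the current text formats lack.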
