hadoop-common-user mailing list archives

From Alan Malloy <alan.mal...@yieldbuild.com>
Subject Re: Losing Records with Block Compressed Sequence File
Date Fri, 21 Jan 2011 23:43:09 GMT
Are you making sure to close the output writer? I had a similar problem in a
different scenario, and it turned out I was neglecting to close/flush my
output.
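
For reference, here is a minimal sketch of the pattern I mean, assuming the
FileSystem-based SequenceFile.createWriter API; the output path, codec, and
IntWritable/Text record types are just placeholders for whatever you are
actually writing:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.DefaultCodec;

public class BlockCompressedWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/records.seq");   // placeholder path

        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, path, IntWritable.class, Text.class,
                CompressionType.BLOCK, new DefaultCodec());
        try {
            for (int i = 0; i < 400000; i++) {
                writer.append(new IntWritable(i), new Text("record-" + i));
            }
        } finally {
            // With CompressionType.BLOCK, records are buffered in memory until
            // roughly io.seqfile.compress.blocksize worth of data accumulates.
            // close() compresses and flushes the last partial block; if the
            // writer is never closed, those buffered records are silently lost.
            writer.close();
        }
    }
}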

On 01/21/2011 01:04 PM, David Sinclair wrote:
> Hi, I am seeing an odd problem when writing block-compressed sequence files.
> If I write 400,000 records into a sequence file without compression, all 400K
> end up in the file. If I write with block compression, regardless of whether
> it is bz2 or deflate, I start losing records. Not a ton, but a couple hundred.
>
> Here are the exact numbers
>
> bz2      399,734
> deflate  399,770
> none     400,000
>
> Conf settings
> io.file.buffer.size - 4K, io.seqfile.compress.blocksize - 1MB
>
> Has anyone ever seen this behavior?
>
> thanks
>
> dave
>
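
As a sanity check on the counts, a quick reader loop can re-count the records
after the writer has been closed. This fragment assumes the same placeholder
path, record types, and imports as the writer sketch above:

        // Re-count records after the file is closed.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/records.seq");   // placeholder path

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            IntWritable key = new IntWritable();
            Text value = new Text();
            long count = 0;
            while (reader.next(key, value)) {
                count++;
            }
            // Expect 400000 once the writer is closed properly.
            System.out.println("records: " + count);
        } finally {
            reader.close();
        }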
