Date: Wed, 8 May 2013 17:47:16 +0000 (UTC)
From: "Doug Cutting (JIRA)"
To: dev@avro.apache.org
Reply-To: dev@avro.apache.org
Subject: [jira] [Updated] (AVRO-1326) Files written with bzip2 codec cannot be read

     [ https://issues.apache.org/jira/browse/AVRO-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1326:
-------------------------------
    Priority: Critical  (was: Minor)

> Files written with bzip2 codec cannot be read
> ---------------------------------------------
>
>                 Key: AVRO-1326
>                 URL: https://issues.apache.org/jira/browse/AVRO-1326
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Kevin Irwin
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1326.patch, BzipTest.java
>
>
> When attempting to read a file written using the bzip2 codec for compression, the following exception is thrown upon completion of the first encoded block:
> Exception in thread "main" org.apache.avro.AvroRuntimeException: java.io.IOException: Block read partially, the data may be corrupt
> 	at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:210)
> 	at BzipTests.main(BzipTests.java:28)
> Caused by: java.io.IOException: Block read partially, the data may be corrupt
> 	at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:194)
> 	... 1 more
> An inspection of BZip2Codec indicates the root cause is in the compress() method. The entire supplied ByteBuffer is compressed, not just the valid portion of the buffer. On decompress, the resultant length is then larger than the recorded uncompressed block size.
> On line 51:
>     outputStream.write(uncompressedData.array());
> should be:
>     outputStream.write(uncompressedData.array(), uncompressedData.position(), uncompressedData.remaining());
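
The root cause described in the report can be illustrated with the JDK alone. What follows is a minimal sketch, not the Avro code: the class BufferSliceDemo and all of its method names are hypothetical, and java.util.zip's GZIP streams stand in for bzip2 so that no extra library is needed. It compresses a 64-byte buffer that holds only 5 valid bytes, once by writing the whole backing array and once by writing only the valid region, then compares the decompressed lengths. The sketch also adds arrayOffset() to the start index, an assumption beyond the one-line fix quoted in the report, which matters only when the buffer is a slice of a larger array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; GZIP is a stand-in for bzip2 so only the JDK is required.
class BufferSliceDemo {

    // Mirrors the reported bug: writes the entire backing array,
    // ignoring the buffer's position and limit.
    static byte[] compressWholeArray(ByteBuffer data) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(baos)) {
            out.write(data.array());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Writes only the valid region, from position up to limit.
    static byte[] compressValidRegion(ByteBuffer data) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(baos)) {
            out.write(data.array(),
                      data.arrayOffset() + data.position(),
                      data.remaining());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Decompresses and returns the number of bytes recovered.
    static int decompressedLength(byte[] compressed) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk)) != -1) {
                baos.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    // A buffer with capacity 64 that holds only 5 valid bytes after flip().
    static ByteBuffer sampleBlock() {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put(new byte[] {1, 2, 3, 4, 5});
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer block = sampleBlock();
        System.out.println("valid bytes:          " + block.remaining());
        System.out.println("whole-array roundtrip: "
            + decompressedLength(compressWholeArray(block)));
        System.out.println("valid-region roundtrip: "
            + decompressedLength(compressValidRegion(block)));
    }
}
```

With the whole-array variant the decompressed length equals the buffer capacity rather than the number of valid bytes. A reader that compares the decompressed data against the recorded block size would see the same kind of mismatch the report describes.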