avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-541) Java: TestDataFileConcat sometimes fails
Date Thu, 12 Aug 2010 03:48:17 GMT

     [ https://issues.apache.org/jira/browse/AVRO-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Carey updated AVRO-541:

    Attachment: AVRO-541.patch

This patch addresses the issue here.

Furthermore, it cleans up and refactors DataFileStream and DataFileWriter a little bit, encapsulating
block write, decode, and encode work in DataFileStream.DataBlock for consistency.

The bug here was caused by a quirk in how Inflater.java works.  This quirk ONLY affects deflate
in 'nowrap' mode.  Simply changing nowrap to false stops the bug, but the resulting output is not up to spec.
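
To illustrate why flipping nowrap changes the bytes on disk, here is a minimal sketch using only java.util.zip. The class and method names (NowrapSketch, deflate) are hypothetical and are not part of the patch. With nowrap set to true the Deflater emits a raw deflate stream (RFC 1951); with nowrap set to false it adds the zlib wrapper (RFC 1950), that is, a 2-byte header and a 4-byte Adler-32 trailer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration class; not Avro code.
final class NowrapSketch {

    // Compresses input with the default level, with or without the zlib wrapper.
    static byte[] deflate(byte[] input, boolean nowrap) throws IOException {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            out.write(input);
        } finally {
            // A caller-supplied Deflater is not ended by the stream, so end it here.
            deflater.end();
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "avro avro avro avro".getBytes(StandardCharsets.UTF_8);
        System.out.println("raw deflate (nowrap=true):  " + deflate(data, true).length + " bytes");
        System.out.println("zlib format (nowrap=false): " + deflate(data, false).length + " bytes");
    }
}
```

The two outputs differ only by the wrapper, so a reader expecting raw deflate blocks would not be able to read blocks written with nowrap set to false.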

The simplest work-around was to use InflaterOutputStream instead of InflaterInputStream. 
This also allows for sharing more code between compress() and decompress().

The OutputStream variations avoid the complexity of having to detect the end
of the stream, which comes with the read() methods of the InputStream interface. That makes it
all much simpler, both in our code and in the internals of InflaterOutputStream and DeflaterOutputStream
compared to the InputStream variants.  It's just easier to 'push' to the Inflate and Deflate
API than to pull.
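
As a rough sketch of the 'push' style described above (not the code in the attached patch), compress() and decompress() can share the same shape: wrap a byte sink in a DeflaterOutputStream or InflaterOutputStream, write exactly the ByteBuffer's window, and close. The class name RawDeflateSketch is hypothetical, and the sketch assumes array-backed ByteBuffers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterOutputStream;

// Hypothetical illustration class; assumes in.hasArray() is true for all inputs.
final class RawDeflateSketch {

    static ByteBuffer compress(ByteBuffer in) throws IOException {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // nowrap = true
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            // Push only the buffer's window; the backing array may be larger.
            out.write(in.array(), in.arrayOffset() + in.position(), in.remaining());
        } finally {
            deflater.end();
        }
        return ByteBuffer.wrap(sink.toByteArray());
    }

    static ByteBuffer decompress(ByteBuffer in) throws IOException {
        Inflater inflater = new Inflater(true); // nowrap = true, matching compress()
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InflaterOutputStream out = new InflaterOutputStream(sink, inflater)) {
            // No read() loop and no end-of-stream detection on our side:
            // the stream inflates whatever is pushed and forwards it to the sink.
            out.write(in.array(), in.arrayOffset() + in.position(), in.remaining());
        } finally {
            inflater.end();
        }
        return ByteBuffer.wrap(sink.toByteArray());
    }
}
```

Because only the bytes between position and limit are written, the compressed block does not need to end where the backing array ends.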

For some information on the sorts of things that were happening, see this Java bug: 

The work-arounds there do not work well for a case where the end of the array is not guaranteed
to be the end of the stream, which it is not when abstracted through a ByteBuffer for input
in decompress().

> Java: TestDataFileConcat sometimes fails
> ----------------------------------------
>                 Key: AVRO-541
>                 URL: https://issues.apache.org/jira/browse/AVRO-541
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Scott Carey
>            Priority: Critical
>             Fix For: 1.4.0
>         Attachments: AVRO-541.patch, AVRO-541.patch
> TestDataFileConcat intermittently fails with:
> {code}
> Testcase: testConcateateFiles[5] took 0.032 sec
>         Caused an ERROR
> java.io.IOException: Block read partially, the data may be corrupt
> org.apache.avro.AvroRuntimeException: java.io.IOException: Block read partially, the data may be corrupt
>         at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:173)
>         at org.apache.avro.file.DataFileStream.next(DataFileStream.java:193)
>         at org.apache.avro.TestDataFileConcat.testConcateateFiles(TestDataFileConcat.java:141)
> Caused by: java.io.IOException: Block read partially, the data may be corrupt
>         at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:157)
> {code}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
