avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-798) add checksums to Snappy codec
Date Thu, 07 Apr 2011 00:34:05 GMT

    [ https://issues.apache.org/jira/browse/AVRO-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016630#comment-13016630 ]

Scott Carey commented on AVRO-798:
----------------------------------

Should we use PureJavaCrc32?  It's a lot faster, though less so for larger arrays.


http://svn.apache.org/viewvc/hadoop/common/trunk/src/java/org/apache/hadoop/util/PureJavaCrc32.java?revision=953881&view=markup

We can just copy that code into a class in o.a.a.util.
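
A minimal sketch of why the swap should be cheap, assuming the copied class implements java.util.zip.Checksum as Hadoop's PureJavaCrc32 does. The class name ChecksumSketch and the helper checksumOf are hypothetical, and java.util.zip.CRC32 stands in for the copied class so the example runs without Hadoop on the classpath:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class ChecksumSketch {
    // Hypothetical helper: computes a checksum with whichever Checksum
    // implementation the caller supplies.
    static long checksumOf(Checksum crc, byte[] data) {
        crc.reset();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Assumption: a copied PureJavaCrc32 would be passed here in place
        // of the JDK's CRC32 and yield the same value.
        System.out.printf("%08x%n", checksumOf(new CRC32(), data)); // prints cbf43926
    }
}
```

Because the codec would depend only on the Checksum interface, either implementation could be chosen without touching the calling code.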

> add checksums to Snappy codec
> -----------------------------
>
>                 Key: AVRO-798
>                 URL: https://issues.apache.org/jira/browse/AVRO-798
>             Project: Avro
>          Issue Type: Improvement
>          Components: java, spec
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.5.1
>
>         Attachments: AVRO-798.patch, AVRO-798.patch
>
>
> A checksum might be included with each compressed block to better detect errors.  While
> some filesystems (e.g. HDFS) may checksum data, not all do.  Data files may also accumulate
> errors when copied between filesystems.  For back-compatibility, we cannot easily add checksums
> to existing data files, but a new codec provides us with the opportunity to do this.
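
The per-block checksum described above could be sketched roughly as follows. The layout is an assumption for illustration only and is not taken from the attached patch: a 4-byte big-endian CRC-32 of the uncompressed data, appended after the compressed bytes. BlockChecksumSketch and its methods are hypothetical names, and the compression step itself is left out, so the compressed bytes are treated as opaque.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

public class BlockChecksumSketch {
    // CRC-32 of the given bytes, truncated to the 32 bits that carry the value.
    static int crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue();
    }

    // Writer side: append the checksum of the uncompressed data to the
    // compressed payload (assumed layout, big-endian by ByteBuffer default).
    static byte[] frame(byte[] compressed, byte[] uncompressed) {
        return ByteBuffer.allocate(compressed.length + 4)
                .put(compressed)
                .putInt(crcOf(uncompressed))
                .array();
    }

    // Reader side: strip the 4-byte trailer to recover the compressed payload.
    static byte[] payload(byte[] framed) {
        return Arrays.copyOf(framed, framed.length - 4);
    }

    // Reader side: after decompressing, compare against the stored checksum.
    static boolean verify(byte[] framed, byte[] decompressed) {
        int stored = ByteBuffer.wrap(framed, framed.length - 4, 4).getInt();
        return stored == crcOf(decompressed);
    }
}
```

A reader would call payload, decompress the result, and then call verify; a mismatch would indicate that the block was corrupted somewhere between writing and reading.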

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
