cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8684) Replace usage of Adler32 with CRC32
Date Mon, 10 Aug 2015 16:22:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680315#comment-14680315
] 

Aleksey Yeschenko commented on CASSANDRA-8684:
----------------------------------------------

Haven't looked deeply enough at the code to say whether it works or is broken, but I've stumbled
upon an issue with {{CRC32}} and {{Checksum}} in general that is a bit surprising:

Docs:
{{code}}
    /**
     * Updates the checksum with the bytes from the specified buffer.
     *
     * The checksum is updated using
     * buffer.{@link java.nio.Buffer#remaining() remaining()}
     * bytes starting at
     * buffer.{@link java.nio.Buffer#position() position()}
     * Upon return, the buffer's position will
     * be updated to its limit; its limit will not have been changed.
     *
     * @param buffer the ByteBuffer to update the checksum with
     * @since 1.8
     */
{{code}}

TL;DR: Unlike our previous {{ICRC32}} interface, {{CRC32}} (and {{Adler32}}) will set the
{{ByteBuffer}}'s {{position}} to its {{limit}}, in place, upon method exit.
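
To make the side effect concrete, here is a minimal sketch, not taken from the patch under review. The class and helper names ({{ChecksumPositionDemo}}, {{crcPreservingPosition}}, {{crcConsuming}}) are hypothetical; only {{java.util.zip.CRC32}} and {{java.nio.ByteBuffer}} are real JDK APIs. It assumes that checksumming a {{duplicate()}} is an acceptable way to leave the caller's position alone, which is one option among several (saving and restoring the position would also work).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumPositionDemo {

    // Hypothetical helper: checksums the remaining bytes without moving
    // the caller's position. duplicate() shares the content but has its
    // own independent position and limit.
    static long crcPreservingPosition(ByteBuffer buffer) {
        CRC32 crc = new CRC32();
        crc.update(buffer.duplicate());
        return crc.getValue();
    }

    // Hypothetical helper: passes the buffer straight through, so on
    // return buffer.position() == buffer.limit(), as the Javadoc states.
    static long crcConsuming(ByteBuffer buffer) {
        CRC32 crc = new CRC32();
        crc.update(buffer);
        return crc.getValue();
    }

    public static void main(String[] args) {
        ByteBuffer buffer =
            ByteBuffer.wrap("123456789".getBytes(StandardCharsets.US_ASCII));

        long kept = crcPreservingPosition(buffer);
        System.out.println("position after duplicate-based update: " + buffer.position());

        long consumed = crcConsuming(buffer);
        System.out.println("position after direct update: " + buffer.position()
                           + " (limit " + buffer.limit() + ")");

        System.out.println("checksums match: " + (kept == consumed));
    }
}
```

Any call site that reads from the buffer after checksumming it would need one of these two treatments; whether the patch already does so is exactly what I have not verified.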

> Replace usage of Adler32 with CRC32
> -----------------------------------
>
>                 Key: CASSANDRA-8684
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8684
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 3.0 beta 1
>
>         Attachments: CRCBenchmark.java, PureJavaCrc32.java, Sample.java
>
>
> I could not find a situation in which Adler32 outperformed PureJavaCrc32, much less the
> intrinsic from Java 8. For small allocations PureJavaCrc32 was much faster, probably due to
> the JNI overhead of invoking the native Adler32 implementation, where the array has to be
> allocated and copied.
> I tested on a 65 W Sandy Bridge i5 running Ubuntu 14.04 with JDK 1.7.0_71 as well as a
> c3.8xlarge running Ubuntu 14.04.
> I think it makes sense to stop using Adler32 when generating new checksums.
> c3.8xlarge, results are time in milliseconds, lower is better
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|47636|46075|25782|
> |128|36755|36712|23782|
> |256|31194|32211|22731|
> |1024|27194|28792|22010|
> |1048576|25941|27807|21808|
> |536870912|25957|27840|21836|
> i5, results are time in milliseconds, lower is better
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|50539|50466|26826|
> |128|37092|38533|24553|
> |256|30630|32938|23459|
> |1024|26064|29079|22592|
> |1048576|24357|27911|22481|
> |536870912|24838|28360|22853|
> Another fun fact: performance of the CRC32 intrinsic appears to double from Sandy Bridge
> to Haswell, unless I am measuring something different when going from Linux/Sandy Bridge to
> Haswell/OS X.
> The intrinsic/JDK 8 implementation also operates better against DirectByteBuffers, and
> code written against the wrapper will get that boost when run with Java 8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
