hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9601) Support native CRC on byte arrays
Date Wed, 05 Jun 2013 15:52:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676055#comment-13676055 ]

Todd Lipcon commented on HADOOP-9601:
-------------------------------------

Looks like a good start. A few thoughts on performance:

- Rather than using JNI to call back into ByteBuffer.hasArray, I think it would be better
to introduce a new native call that takes the array directly. Callbacks from JNI into
Java methods are going to be much slower since they don't get inlined, etc., whereas
hasArray() checks made from Java will be JITted nicely.

- In the case of arrays, we should chunk the GetPrimitiveArrayCritical calls so that no
single call grabs more than maybe 256KB at a time. Otherwise you can run into issues where
CRC calculation in one thread blocks all other threads at a pre-GC safepoint. That was one
of the reasons we switched to Pure Java CRC a couple of years back.
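
A minimal Java-side sketch of the two points above, under stated assumptions: the
hasArray() check happens in Java, and the array is handed over in slices of at most
256KB. The class name ChunkedCrc, the constant MAX_CHUNK and the method crcChunk are
hypothetical and not taken from the patch. crcChunk stands in for a native method that
would take the byte[] directly; java.util.zip.CRC32 is used here only so the sketch runs
without native code, and it is not the checksum implementation under discussion.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class ChunkedCrc {
    // Assumed upper bound on the bytes passed to one call (the 256KB suggested above).
    static final int MAX_CHUNK = 256 * 1024;

    // Hypothetical stand-in for a native method that takes the array directly.
    // A real native version would hold the array only for the duration of this call.
    static void crcChunk(CRC32 state, byte[] buf, int off, int len) {
        state.update(buf, off, len);
    }

    static long checksum(ByteBuffer buf) {
        CRC32 state = new CRC32();
        if (buf.hasArray()) {
            // Decided in Java, so no JNI callback is needed to inspect the buffer.
            byte[] arr = buf.array();
            int off = buf.arrayOffset() + buf.position();
            int remaining = buf.remaining();
            while (remaining > 0) {
                int n = Math.min(remaining, MAX_CHUNK);
                crcChunk(state, arr, off, n);
                off += n;
                remaining -= n;
            }
        } else {
            // Buffers without an accessible array (e.g. direct buffers) take the other path.
            // duplicate() keeps the caller's position unchanged.
            state.update(buf.duplicate());
        }
        return state.getValue();
    }
}
```

With this shape, a 1MB array-backed buffer would result in four calls to crcChunk rather
than one long one, which is the property the chunking suggestion is after.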

Did you try running the CRC benchmark tests? I think there are some floating around that compare
direct buffer CRC performance vs array, etc.
                
> Support native CRC on byte arrays
> ---------------------------------
>
>                 Key: HADOOP-9601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9601
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, util
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>         Attachments: HADOOP-9601-WIP-01.patch
>
>
> When we first implemented the Native CRC code, we only did so for direct byte buffers,
> because these correspond directly to native heap memory and thus make it easy to access via
> JNI. We'd generally assumed that accessing byte[] arrays from JNI was not efficient enough,
> but now that I know more about JNI I don't think that's true -- we just need to make sure
> that the critical sections where we lock the buffers are short.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
