hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-11927) If java7, use zip crc
Date Thu, 11 Sep 2014 20:48:34 GMT

     [ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-11927:
    Attachment: crc32ct.svg

So, messing with compactiontool on an HDFS cluster, I see 20% of CPU given over to creating
checksums but none to verifying them!  What's up?

It turns out that in this 'tool' context, HDFS is doing the verification of the checksums.  I am
running on top of branch-2 HDFS, so I have the latest native CRC improvements.  You can see the
native calls if you look hard: they are to the right of the second peak in this flame graph.
There are not many samples, but they show up as 0.9 percent, as opposed to the 20% you can see
in the first graphs I posted, where the flame graphs were taken against a running HBase regionserver.

Let me see if I can get HBase to use the native checksum creation and verification when they are available.
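
For readers unfamiliar with the create/verify split discussed above, here is a minimal sketch of
per-chunk checksumming with the JDK's java.util.zip.CRC32, which is presumably the "zip crc" in the
issue title.  It is an illustration only, not the HBase or HDFS code path, and it does not use the
Hadoop native library.  The class name, the method names, and the 512-byte chunk size are all
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative sketch only: names and chunk size are assumptions, not HBase/HDFS code.
class ChunkedCrcSketch {
    // Hypothetical number of data bytes covered by each checksum.
    static final int BYTES_PER_CHECKSUM = 512;

    // "Create" side: one CRC32 value per chunk of the input.
    static long[] createChecksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // "Verify" side: recompute and compare.
    // Returns the index of the first mismatching chunk, or -1 if everything matches.
    static int verifyChecksums(byte[] data, long[] expected) {
        long[] actual = createChecksums(data);
        int common = Math.min(actual.length, expected.length);
        for (int i = 0; i < common; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return actual.length == expected.length ? -1 : common;
    }

    public static void main(String[] args) {
        byte[] data = "some block contents".getBytes(StandardCharsets.UTF_8);
        long[] sums = createChecksums(data);
        System.out.println("chunks checksummed: " + sums.length);
        System.out.println("first bad chunk: " + verifyChecksums(data, sums));
    }
}
```

Both sides cost CPU for every chunk, which is why creation and verification each show up as their
own hot spot in a profile.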

> If java7, use zip crc
> ---------------------
>                 Key: HBASE-11927
>                 URL: https://issues.apache.org/jira/browse/HBASE-11927
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.99.1
>         Attachments: c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg
> Up in hadoop they have this change. Let me publish some graphs to show that it makes
a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because
of compacting, flushing, etc.).  We should also make use of native CRCings -- especially the
2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now.

This message was sent by Atlassian JIRA