hadoop-common-issues mailing list archives

From "Edward Nevill (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11660) Add support for hardware crc of HDFS checksums on ARM aarch64 architecture
Date Sat, 13 Jun 2015 18:57:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584776#comment-14584776 ]

Edward Nevill commented on HADOOP-11660:
----------------------------------------

Hi Andrew,

Yes. Please see my comments of 4/Mar/15.

The version of the CRC patch checked into Hadoop does support pipelining.

On A57, the raw CRC speed was 4.5X better without pipelining and 11X better with pipelining.

Regards,
Ed.
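
For readers unfamiliar with the term, pipelining here refers to interleaving the CRC updates of several independent chunks, so that consecutive CRC operations do not have to wait on each other's result. The following is a minimal Python sketch of that data flow only, not the Hadoop native code: the table-driven CRC32C, the function names, and the three-way interleave are illustrative assumptions, and Python will not show the hardware speedup.

```python
# Illustrative sketch only: shows the data flow of a "pipelined" CRC,
# where three independent chunks are advanced in lockstep.
# All names here are hypothetical and not taken from the Hadoop patch.

CRC32C_POLY = 0x82F63B78  # Castagnoli polynomial, reflected form


def make_table(poly):
    """Build the 256-entry lookup table for a reflected 32-bit CRC."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ poly if c & 1 else c >> 1
        table.append(c)
    return table


CRC32C_TABLE = make_table(CRC32C_POLY)


def crc32c(data):
    """Plain (non-pipelined) CRC32C of one chunk, one byte at a time."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32c_interleaved3(a, b, c):
    """CRC32C of three equal-length chunks, advanced in lockstep.

    Each loop iteration issues three updates that do not depend on one
    another, which is what lets real hardware overlap their latency.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("chunks must have equal length")
    c1 = c2 = c3 = 0xFFFFFFFF
    for x, y, z in zip(a, b, c):
        c1 = CRC32C_TABLE[(c1 ^ x) & 0xFF] ^ (c1 >> 8)
        c2 = CRC32C_TABLE[(c2 ^ y) & 0xFF] ^ (c2 >> 8)
        c3 = CRC32C_TABLE[(c3 ^ z) & 0xFF] ^ (c3 >> 8)
    return (c1 ^ 0xFFFFFFFF, c2 ^ 0xFFFFFFFF, c3 ^ 0xFFFFFFFF)
```

The interleaved function returns the same three checksums as calling crc32c on each chunk separately; only the order of the work changes.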


> Add support for hardware crc of HDFS checksums on ARM aarch64 architecture
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-11660
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11660
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: native
>    Affects Versions: 2.8.0
>         Environment: ARM aarch64 development platform
>            Reporter: Edward Nevill
>            Assignee: Edward Nevill
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.8.0
>
>         Attachments: jira-11660.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This patch adds support for hardware CRC for ARM's new 64-bit architecture.
> The patch is completely conditionalized on __aarch64__.
> I have only added support for the non-pipelined version, as I benchmarked the
> pipelined version on aarch64 and it showed no performance improvement.
> The aarch64 version supports both Castagnoli and Zlib CRCs, as both of these
> are supported on ARM aarch64 hardware.
> To benchmark this I modified the test_bulk_crc32 test to print out the time
> taken to CRC a 1MB dataset 1000 times.
> Before:
> CRC 1048576 bytes @ 512 bytes per checksum X 1000 iterations = 2.55
> CRC 1048576 bytes @ 512 bytes per checksum X 1000 iterations = 2.55
> After:
> CRC 1048576 bytes @ 512 bytes per checksum X 1000 iterations = 0.57
> CRC 1048576 bytes @ 512 bytes per checksum X 1000 iterations = 0.57
> So this represents a 5X performance improvement on raw CRC calculation.
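
The benchmark described above can be approximated with a short Python sketch. This is not the modified test_bulk_crc32 code: it assumes Python's zlib.crc32 (the Zlib CRC only) as a stand-in, and the function names are hypothetical. It also shows the ratio of the two reported times.

```python
# Illustrative sketch only: times a per-chunk CRC over a 1 MB buffer.
# Function names are hypothetical; zlib.crc32 stands in for the native code.
import time
import zlib

DATA_LEN = 1048576          # 1 MB dataset, as in the description
BYTES_PER_CHECKSUM = 512    # one checksum per 512-byte chunk


def bulk_crc(data, bytes_per_checksum):
    """Return one Zlib CRC32 per fixed-size chunk of data."""
    return [
        zlib.crc32(data[i:i + bytes_per_checksum])
        for i in range(0, len(data), bytes_per_checksum)
    ]


def time_bulk_crc(data, bytes_per_checksum, iterations):
    """Time repeated bulk CRC runs; returns elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        bulk_crc(data, bytes_per_checksum)
    return time.perf_counter() - start


def speedup(before, after):
    """Ratio of the 'before' time to the 'after' time."""
    return before / after


if __name__ == "__main__":
    data = bytes(range(256)) * (DATA_LEN // 256)
    iterations = 10  # kept small here; the description uses 1000
    elapsed = time_bulk_crc(data, BYTES_PER_CHECKSUM, iterations)
    print("CRC %d bytes @ %d bytes per checksum X %d iterations = %.2f"
          % (DATA_LEN, BYTES_PER_CHECKSUM, iterations, elapsed))
    # Ratio of the times reported in the description (2.55 and 0.57).
    print("speedup = %.2f" % speedup(2.55, 0.57))
```

With the reported times, 2.55 / 0.57 is about 4.47, which matches the 4.5X non-pipelined figure quoted in the comment.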



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
