hadoop-common-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7333) Performance improvement in PureJavaCrc32
Date Thu, 26 May 2011 17:27:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039807#comment-13039807 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-7333:
------------------------------------------------

Eric, good observation!  I have just tried your patch and got the following results.  Could
you also post your results?  You can simply copy and paste the text output into the JIRA.

- With patch:
java.version = 1.6.0_10
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_10-b33
java.vm.version = 11.0-b15
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp

Performance Table (The unit is MB/sec)
|| Num Bytes ||    CRC32 || PureJavaCrc32 ||
|          1 |     7.673 |        105.282 |
|          2 |    14.646 |        131.429 |
|          4 |    28.523 |        115.807 |
|          8 |    51.734 |        281.044 |
|         16 |    87.154 |        296.732 |
|         32 |   131.368 |        338.979 |
|         64 |   181.210 |        363.733 |
|        128 |   219.660 |        376.653 |
|        256 |   247.728 |        383.204 |
|        512 |   263.704 |        387.144 |
|       1024 |   271.574 |        387.665 |
|       2048 |   276.787 |        389.429 |
|       4096 |   279.420 |        390.218 |
|       8192 |   280.808 |        390.630 |
|      16384 |   280.895 |        388.867 |
|      32768 |   280.641 |        386.111 |
|      65536 |   280.908 |        386.008 |
|     131072 |   281.049 |        386.089 |
|     262144 |   281.117 |        386.124 |
|     524288 |   281.133 |        386.130 |
|    1048576 |   281.193 |        386.013 |
|    2097152 |   281.208 |        385.508 |
|    4194304 |   280.439 |        384.409 |
|    8388608 |   279.023 |        382.025 |
|   16777216 |   278.610 |        381.445 |

- Without patch:
java.version = 1.6.0_10
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_10-b33
java.vm.version = 11.0-b15
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp

Performance Table (The unit is MB/sec)
|| Num Bytes ||    CRC32 || PureJavaCrc32 ||
|          1 |     7.812 |         78.613 |
|          2 |    15.149 |        135.446 |
|          4 |    28.214 |        144.649 |
|          8 |    50.598 |        292.826 |
|         16 |    87.112 |        294.373 |
|         32 |   132.665 |        354.003 |
|         64 |   179.303 |        387.926 |
|        128 |   218.238 |        408.221 |
|        256 |   246.192 |        418.157 |
|        512 |   262.853 |        424.399 |
|       1024 |   271.822 |        425.695 |
|       2048 |   276.041 |        428.904 |
|       4096 |   279.137 |        428.804 |
|       8192 |   280.552 |        429.013 |
|      16384 |   280.489 |        429.608 |
|      32768 |   280.647 |        426.743 |
|      65536 |   281.974 |        427.366 |
|     131072 |   282.104 |        427.439 |
|     262144 |   282.079 |        427.408 |
|     524288 |   282.252 |        427.355 |
|    1048576 |   282.310 |        427.171 |
|    2097152 |   282.136 |        426.867 |
|    4194304 |   280.107 |        425.458 |
|    8388608 |   280.020 |        422.325 |
|   16777216 |   279.599 |        421.762 |
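
To compare the two tables row by row, the relative change in PureJavaCrc32 throughput can be computed directly from the numbers above. The following is only a small sketch: the class name ThroughputDelta and the method percentChange are hypothetical, and just four rows are copied in for illustration. A positive value means the patched run was faster for that input size.

```java
/** Hypothetical helper for comparing the two PureJavaCrc32 columns above. */
final class ThroughputDelta {

    /** Percentage change of the patched throughput relative to the unpatched one. */
    static double percentChange(double withPatch, double withoutPatch) {
        return (withPatch / withoutPatch - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // {num bytes, with patch, without patch}; MB/sec values copied from the tables above.
        double[][] rows = {
            {1, 105.282, 78.613},
            {4, 115.807, 144.649},
            {1024, 387.665, 425.695},
            {16777216, 381.445, 421.762},
        };
        for (double[] r : rows) {
            System.out.printf("%10d bytes: %+.1f%%%n", (long) r[0], percentChange(r[1], r[2]));
        }
    }
}
```

The remaining rows can be added to the array in the same way. Since these are single runs on one machine, small differences should be read with some caution.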


> Performance improvement in PureJavaCrc32
> ----------------------------------------
>
>                 Key: HADOOP-7333
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7333
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.21.0
>         Environment: Linux x64
>            Reporter: Eric Caspole
>            Assignee: Eric Caspole
>            Priority: Minor
>         Attachments: HADOOP-7333.patch
>
>
> I would like to propose a small patch to
>   org.apache.hadoop.util.PureJavaCrc32.update(byte[] b, int off, int len)
> Currently the method stores the intermediate result back into the data member "crc".
> I noticed that this method gets inlined into DataChecksum.update(), and that method appears
> as one of the hotter methods in a simple hprof profile collected while running terasort
> and gridmix.
> If the code is modified to keep the temporary result in a local variable and store the
> final result back into the data member just once, the result is slightly more efficient
> HotSpot codegen.
> I tested this change using "org.apache.hadoop.util.TestPureJavaCrc32$PerformanceTest",
> which is embedded in the existing unit test for this class, TestPureJavaCrc32, on a variety
> of Linux x64 AMD and Intel multi-socket and multi-core systems that I have available.
> The patch removes several stores of the intermediate result to memory, yielding a 0%-10%
> speedup in that performance test.
>  
> If you use a debug HotSpot JVM with -XX:+PrintOptoAssembly, you can see the intermediate
> stores, such as:
> 414     movq    R9, [rsp + #24] # spill
> 419     movl    [R9 + #12 (8-bit)], RDX # int ! Field PureJavaCrc32.crc
> 41d     xorl    R10, RDX        # int
> The patch results in just one final store of the fully computed value.
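
The change described in the quoted text can be illustrated with a simplified sketch. This is not the actual PureJavaCrc32 source or the attached patch: it assumes a plain one-table, byte-at-a-time CRC-32, and the class name LocalAccumulatorCrc32 and both method names are hypothetical. The first method writes the field on every iteration; the second keeps the running value in a local and stores it once at the end.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch of the field-store versus local-accumulator pattern. */
final class LocalAccumulatorCrc32 {
    // Standard reflected CRC-32 lookup table (polynomial 0xEDB88320).
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    private int crc = 0xFFFFFFFF; // bit-inverted running CRC

    /** Stores the intermediate result into the field on every iteration. */
    void updateViaField(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xFF];
        }
    }

    /** Keeps the intermediate result in a local and stores it to the field once. */
    void updateViaLocal(byte[] b, int off, int len) {
        int localCrc = crc;
        for (int i = off; i < off + len; i++) {
            localCrc = (localCrc >>> 8) ^ TABLE[(localCrc ^ b[i]) & 0xFF];
        }
        crc = localCrc; // single final store
    }

    /** Returns the checksum as an unsigned 32-bit value. */
    long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        LocalAccumulatorCrc32 viaField = new LocalAccumulatorCrc32();
        viaField.updateViaField(data, 0, data.length);

        LocalAccumulatorCrc32 viaLocal = new LocalAccumulatorCrc32();
        viaLocal.updateViaLocal(data, 0, data.length);

        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);

        // All three values should agree.
        System.out.println(Long.toHexString(viaField.getValue()));
        System.out.println(Long.toHexString(viaLocal.getValue()));
        System.out.println(Long.toHexString(reference.getValue()));
    }
}
```

Both variants compute the same checksum; only the number of stores to the field differs. Whether the JIT compiler already removes the extra stores on its own depends on the JVM and the surrounding code, so any gain has to be measured.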

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
