[ https://issues.apache.org/jira/browse/HADOOP-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039952#comment-13039952
]
Tsz Wo (Nicholas), SZE commented on HADOOP-7333:
------------------------------------------------
Tried 6 cases: three different machines, each with two JVMs (the latest, 1.6.0_25-b06, and an
earlier version). The patch improves performance on the latest JVM (1.6.0_25-b06) for most
input sizes but degrades performance on some earlier JVMs.
In the tables below, *CRC32* is {{java.util.zip.CRC32}}, *PureJavaCrc32* is the current code
in trunk, and *H7333* is the current code with the patch.
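For readers unfamiliar with how MB/sec figures like these can be produced, here is a rough sketch of a throughput measurement. It is not the actual harness used for the tables; the class name {{CrcThroughputSketch}}, the method {{measureMBPerSec}}, and the iteration count are hypothetical, and only {{java.util.zip.CRC32}} is measured because it ships with the JDK.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical sketch: times repeated checksum updates over a fixed-size buffer.
public class CrcThroughputSketch {
    // Accumulates results so the checksum work is observably used.
    static long sink;

    // Returns throughput in MB/sec (1 MB = 1024 * 1024 bytes, an assumption of this sketch).
    static double measureMBPerSec(Checksum crc, int numBytes, int iterations) {
        byte[] data = new byte[numBytes];
        new Random(0).nextBytes(data); // fixed seed for repeatable input
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            crc.reset();
            crc.update(data, 0, data.length);
            sink ^= crc.getValue();
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        double megabytes = (double) numBytes * iterations / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // Warm up first so the JIT has a chance to compile the hot loop.
        measureMBPerSec(new CRC32(), 1024, 100000);
        for (int size = 1; size <= 4096; size *= 2) {
            System.out.printf("| %d | %.3f |%n", size,
                    measureMBPerSec(new CRC32(), size, 100000));
        }
    }
}
```

Numbers from a sketch like this vary with JIT warm-up, JVM version, and machine load, which is consistent with the spread across the cases below.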
-----
h3. 1.1) Linux 2.6.9-55.ELsmp - Java 1.6.0_10
java.version = 1.6.0_10
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_10-b33
java.vm.version = 11.0-b15
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 7.842 | 72.924 | 82.086 |
| 2 | 15.270 | 106.023 | 121.040 |
| 4 | 27.828 | 103.828 | 119.558 |
| 8 | 52.146 | 263.903 | 253.742 |
| 16 | 85.452 | 283.530 | 280.151 |
| 32 | 130.481 | 343.833 | 325.426 |
| 64 | 177.955 | 380.677 | 354.599 |
| 128 | 217.593 | 404.632 | 372.172 |
| 256 | 243.913 | 416.707 | 379.768 |
| 512 | 261.980 | 423.663 | 384.451 |
| 1024 | 271.463 | 425.220 | 386.562 |
| 2048 | 276.924 | 429.696 | 389.164 |
| 4096 | 279.536 | 430.665 | 390.104 |
| 8192 | 280.603 | 431.043 | 390.557 |
| 16384 | 281.350 | 431.056 | 389.888 |
| 32768 | 281.492 | 427.595 | 387.046 |
| 65536 | 281.838 | 426.990 | 385.032 |
| 131072 | 281.953 | 427.164 | 386.727 |
| 262144 | 282.085 | 427.328 | 387.082 |
| 524288 | 282.137 | 427.503 | 387.093 |
| 1048576 | 282.076 | 427.465 | 387.000 |
| 2097152 | 282.106 | 427.219 | 386.776 |
| 4194304 | 281.215 | 425.642 | 385.575 |
| 8388608 | 280.048 | 422.767 | 383.008 |
| 16777216 | 279.671 | 422.120 | 382.612 |
h3. 1.2) Linux 2.6.9-55.ELsmp - Java 1.6.0_25-b06
java.version = 1.6.0_25
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_25-b06
java.vm.version = 20.0-b11
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 7.408 | 105.707 | 91.593 |
| 2 | 14.466 | 166.277 | 148.577 |
| 4 | 27.889 | 195.146 | 202.841 |
| 8 | 50.039 | 248.775 | 248.598 |
| 16 | 84.972 | 329.594 | 335.569 |
| 32 | 130.063 | 394.096 | 403.355 |
| 64 | 178.457 | 436.276 | 448.967 |
| 128 | 217.633 | 461.647 | 475.962 |
| 256 | 245.799 | 474.164 | 490.574 |
| 512 | 262.378 | 482.075 | 498.107 |
| 1024 | 270.899 | 482.538 | 498.268 |
| 2048 | 276.426 | 486.091 | 502.043 |
| 4096 | 279.105 | 487.801 | 504.114 |
| 8192 | 278.792 | 488.647 | 504.989 |
| 16384 | 281.229 | 488.611 | 505.019 |
| 32768 | 281.292 | 485.426 | 501.475 |
| 65536 | 281.670 | 485.282 | 499.889 |
| 131072 | 280.752 | 483.555 | 499.573 |
| 262144 | 280.870 | 483.588 | 499.616 |
| 524288 | 280.863 | 483.364 | 499.629 |
| 1048576 | 280.913 | 483.524 | 499.278 |
| 2097152 | 280.552 | 483.234 | 499.226 |
| 4194304 | 280.172 | 481.426 | 497.313 |
| 8388608 | 278.749 | 477.359 | 492.912 |
| 16777216 | 278.458 | 476.276 | 492.101 |
h3. 2.1) Linux 2.6.18-53.1.13.el5 - Java 1.6.0_05-b13
java.version = 1.6.0_05
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_05-b13
java.vm.version = 10.0-b19
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = i386
os.name = Linux
os.version = 2.6.18-53.1.13.el5
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 9.705 | 111.072 | 111.346 |
| 2 | 18.016 | 138.367 | 137.494 |
| 4 | 34.638 | 213.342 | 226.139 |
| 8 | 62.939 | 221.194 | 239.386 |
| 16 | 107.019 | 289.123 | 307.963 |
| 32 | 164.449 | 368.337 | 378.862 |
| 64 | 224.498 | 425.263 | 423.380 |
| 128 | 275.259 | 463.062 | 449.381 |
| 256 | 309.731 | 485.426 | 465.417 |
| 512 | 331.076 | 497.375 | 473.646 |
| 1024 | 342.002 | 499.592 | 475.978 |
| 2048 | 349.365 | 504.120 | 479.042 |
| 4096 | 352.857 | 506.511 | 480.351 |
| 8192 | 353.812 | 507.538 | 474.044 |
| 16384 | 351.864 | 507.212 | 480.870 |
| 32768 | 351.206 | 499.854 | 479.526 |
| 65536 | 352.413 | 492.054 | 480.082 |
| 131072 | 348.713 | 499.692 | 480.194 |
| 262144 | 353.375 | 491.986 | 480.235 |
| 524288 | 353.703 | 499.682 | 480.275 |
| 1048576 | 353.387 | 499.606 | 480.259 |
| 2097152 | 352.382 | 491.653 | 480.062 |
| 4194304 | 352.810 | 499.002 | 479.758 |
| 8388608 | 351.093 | 493.473 | 475.056 |
| 16777216 | 350.968 | 492.484 | 473.960 |
h3. 2.2) Linux 2.6.18-53.1.13.el5 - Java 1.6.0_25-b06
java.version = 1.6.0_25
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_25-b06
java.vm.version = 20.0-b11
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.18-53.1.13.el5
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 8.623 | 131.418 | 108.732 |
| 2 | 17.125 | 195.779 | 179.235 |
| 4 | 31.958 | 239.181 | 250.013 |
| 8 | 56.425 | 296.820 | 306.358 |
| 16 | 96.881 | 397.425 | 406.786 |
| 32 | 154.939 | 475.686 | 499.760 |
| 64 | 211.171 | 526.751 | 556.665 |
| 128 | 267.582 | 559.062 | 591.984 |
| 256 | 305.220 | 575.348 | 613.347 |
| 512 | 326.476 | 583.960 | 623.138 |
| 1024 | 338.115 | 585.560 | 624.430 |
| 2048 | 345.509 | 589.541 | 628.801 |
| 4096 | 349.223 | 591.533 | 630.996 |
| 8192 | 351.142 | 592.521 | 632.064 |
| 16384 | 352.057 | 592.771 | 628.888 |
| 32768 | 351.449 | 588.814 | 627.688 |
| 65536 | 352.759 | 588.454 | 627.405 |
| 131072 | 352.914 | 588.611 | 627.559 |
| 262144 | 352.735 | 588.665 | 627.592 |
| 524288 | 353.095 | 588.660 | 627.592 |
| 1048576 | 353.145 | 588.571 | 627.485 |
| 2097152 | 353.140 | 588.262 | 627.276 |
| 4194304 | 352.952 | 587.857 | 626.714 |
| 8388608 | 350.475 | 581.068 | 618.994 |
| 16777216 | 349.872 | 579.330 | 617.299 |
h3. 3.1) Windows XP - Java 1.6.0_23-b05
java.version = 1.6.0_23
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_23-b05
java.vm.version = 19.0-b09
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Client VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = x86
os.name = Windows XP
os.version = 5.1
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 4.276 | 62.669 | 64.448 |
| 2 | 7.978 | 102.239 | 109.401 |
| 4 | 16.730 | 132.354 | 147.217 |
| 8 | 31.439 | 182.390 | 195.586 |
| 16 | 54.836 | 219.618 | 234.732 |
| 32 | 87.627 | 238.995 | 270.211 |
| 64 | 129.170 | 253.900 | 304.123 |
| 128 | 170.240 | 278.521 | 321.418 |
| 256 | 211.477 | 286.423 | 341.665 |
| 512 | 265.165 | 304.382 | 279.309 |
| 1024 | 282.990 | 249.336 | 350.560 |
| 2048 | 243.438 | 307.147 | 353.229 |
| 4096 | 294.114 | 309.688 | 281.155 |
| 8192 | 301.212 | 280.119 | 334.366 |
| 16384 | 243.790 | 290.962 | 344.765 |
| 32768 | 292.988 | 295.948 | 272.812 |
| 65536 | 290.493 | 282.761 | 348.932 |
| 131072 | 250.337 | 301.592 | 352.376 |
| 262144 | 302.263 | 307.579 | 285.689 |
| 524288 | 304.144 | 306.469 | 331.565 |
| 1048576 | 289.816 | 205.993 | 350.036 |
| 2097152 | 295.914 | 289.658 | 285.215 |
| 4194304 | 270.211 | 300.434 | 295.595 |
| 8388608 | 236.343 | 276.725 | 339.176 |
| 16777216 | 291.456 | 295.043 | 311.783 |
h3. 3.2) Windows XP - Java 1.6.0_25-b06
java.version = 1.6.0_25
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_25-b06
java.vm.version = 20.0-b11
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Client VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = x86
os.name = Windows XP
os.version = 5.1
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 || H7333 ||
| 1 | 3.868 | 63.751 | 68.806 |
| 2 | 7.819 | 98.380 | 115.517 |
| 4 | 15.446 | 124.773 | 164.977 |
| 8 | 29.635 | 174.449 | 237.747 |
| 16 | 55.030 | 196.089 | 290.992 |
| 32 | 94.131 | 244.999 | 325.735 |
| 64 | 143.858 | 266.674 | 346.240 |
| 128 | 192.723 | 330.168 | 281.406 |
| 256 | 237.439 | 332.342 | 362.283 |
| 512 | 216.150 | 333.393 | 364.037 |
| 1024 | 273.192 | 337.913 | 282.942 |
| 2048 | 293.097 | 336.928 | 368.440 |
| 4096 | 300.675 | 260.551 | 368.051 |
| 8192 | 304.412 | 338.468 | 368.042 |
| 16384 | 232.549 | 338.616 | 367.385 |
| 32768 | 305.303 | 334.487 | 286.756 |
| 65536 | 306.435 | 331.820 | 356.131 |
| 131072 | 274.592 | 292.947 | 363.439 |
| 262144 | 304.008 | 336.345 | 285.965 |
| 524288 | 306.601 | 328.582 | 367.409 |
| 1048576 | 302.045 | 251.833 | 367.058 |
| 2097152 | 296.679 | 333.299 | 362.986 |
| 4194304 | 243.700 | 332.293 | 344.360 |
| 8388608 | 302.936 | 332.315 | 303.883 |
| 16777216 | 301.516 | 329.897 | 362.755 |
> Performance improvement in PureJavaCrc32
> ----------------------------------------
>
> Key: HADOOP-7333
> URL: https://issues.apache.org/jira/browse/HADOOP-7333
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 0.21.0
> Environment: Linux x64
> Reporter: Eric Caspole
> Assignee: Eric Caspole
> Priority: Minor
> Attachments: HADOOP-7333.patch
>
>
> I would like to propose a small patch to
> org.apache.hadoop.util.PureJavaCrc32.update(byte[] b, int off, int len)
> Currently the method stores the intermediate result back into the data member "crc".
> I noticed this method gets inlined into DataChecksum.update(), and that method appears
> as one of the hotter methods in a simple hprof profile collected while running terasort
> and gridmix.
> If the code is modified to keep the temporary result in a local variable and store the
> final result back into the data member just once, it results in slightly more efficient
> HotSpot codegen.
> I tested this change using "org.apache.hadoop.util.TestPureJavaCrc32$PerformanceTest",
> which is embedded in the existing unit test for this class, TestPureJavaCrc32, on a
> variety of Linux x64 AMD and Intel multi-socket and multi-core systems I have available
> to test.
> The patch removes several stores of the intermediate result to memory, yielding a
> 0%-10% speedup in that test.
>
> If you use a debug HotSpot JVM with -XX:+PrintOptoAssembly, you can see the intermediate
> stores such as:
> 414 movq R9, [rsp + #24] # spill
> 419 movl [R9 + #12 (8-bit)], RDX # int ! Field PureJavaCrc32.crc
> 41d xorl R10, RDX # int
> The patch results in just one final store of the fully computed value.
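The idea described in the quoted issue can be illustrated with a minimal sketch. This is not the attached patch and not the actual PureJavaCrc32 code: it assumes a simple one-table, byte-at-a-time CRC-32, and the class name {{LocalAccumulatorCrc}} and both method names are hypothetical. The first variant writes the field on every iteration; the second keeps the running value in a local and stores it once.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration of "accumulate in a local, store the field once".
public class LocalAccumulatorCrc {
    // Standard reflected CRC-32 table (polynomial 0xEDB88320).
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    private int crc = 0xFFFFFFFF;

    // Variant A: the intermediate result is written to the field each iteration.
    public void updateViaField(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xFF];
        }
    }

    // Variant B: the intermediate result lives in a local; one final store.
    public void updateViaLocal(byte[] b, int off, int len) {
        int localCrc = crc;
        for (int i = off; i < off + len; i++) {
            localCrc = (localCrc >>> 8) ^ TABLE[(localCrc ^ b[i]) & 0xFF];
        }
        crc = localCrc;
    }

    public long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);

        LocalAccumulatorCrc viaField = new LocalAccumulatorCrc();
        viaField.updateViaField(data, 0, data.length);
        LocalAccumulatorCrc viaLocal = new LocalAccumulatorCrc();
        viaLocal.updateViaLocal(data, 0, data.length);

        // Both variants should agree with java.util.zip.CRC32.
        System.out.println(viaField.getValue() == reference.getValue());
        System.out.println(viaLocal.getValue() == reference.getValue());
    }
}
```

Both variants compute the same checksum; whether the second is faster depends on the JIT compiler, as the per-JVM results in the comment above suggest.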
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira