hadoop-common-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
Date Sat, 13 Oct 2012 09:03:04 GMT
Gopal V created HADOOP-8926:

             Summary: hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
                 Key: HADOOP-8926
                 URL: https://issues.apache.org/jira/browse/HADOOP-8926
             Project: Hadoop Common
          Issue Type: Improvement
          Components: util
    Affects Versions: 2.0.3-alpha
         Environment: Ubuntu 10.10 i386 
            Reporter: Gopal V
            Priority: Trivial

While running microbenchmarks for the HDFS write code path, I found that DataChecksum.update()
consumed a significant fraction of the CPU time.

The attached patch converts the static arrays in CRC32 into a single linear array for a performance
boost in the inner loop.
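
To illustrate the idea, here is a minimal sketch of a slicing-by-8 CRC32 that keeps all eight 256-entry lookup tables in one linear int[] and addresses each slice by a constant offset. It is not the attached patch: the class name FlatTableCrc32, the helper readIntLE, and the exact table layout are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the attached patch: eight CRC32 lookup tables
// stored back to back in a single array instead of eight static arrays.
final class FlatTableCrc32 {
    private static final int SLICE = 256;
    // Slice k occupies indices [k * 256, k * 256 + 255].
    private static final int[] T = new int[8 * SLICE];

    static {
        // Slice 0 is the classic byte-at-a-time table (reflected polynomial).
        for (int i = 0; i < SLICE; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            T[i] = c;
        }
        // Slice k is derived from slice k - 1 by feeding in one more zero byte.
        for (int i = 0; i < SLICE; i++) {
            int c = T[i];
            for (int k = 1; k < 8; k++) {
                c = T[c & 0xff] ^ (c >>> 8);
                T[k * SLICE + i] = c;
            }
        }
    }

    // Reads four bytes as a little-endian int.
    private static int readIntLE(byte[] b, int off) {
        return (b[off] & 0xff)
            | ((b[off + 1] & 0xff) << 8)
            | ((b[off + 2] & 0xff) << 16)
            | ((b[off + 3] & 0xff) << 24);
    }

    static long crc32(byte[] b, int off, int len) {
        int crc = 0xFFFFFFFF;
        // Main loop: eight bytes per iteration, one array reference throughout.
        while (len >= 8) {
            int lo = crc ^ readIntLE(b, off);
            int hi = readIntLE(b, off + 4);
            crc = T[7 * SLICE + (lo & 0xff)]
                ^ T[6 * SLICE + ((lo >>> 8) & 0xff)]
                ^ T[5 * SLICE + ((lo >>> 16) & 0xff)]
                ^ T[4 * SLICE + (lo >>> 24)]
                ^ T[3 * SLICE + (hi & 0xff)]
                ^ T[2 * SLICE + ((hi >>> 8) & 0xff)]
                ^ T[1 * SLICE + ((hi >>> 16) & 0xff)]
                ^ T[hi >>> 24];
            off += 8;
            len -= 8;
        }
        // Tail: remaining bytes one at a time, using slice 0 only.
        while (len-- > 0) {
            crc = T[(crc ^ b[off++]) & 0xff] ^ (crc >>> 8);
        }
        return ~crc & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32(check, 0, check.length)));
    }
}
```

With a single array the inner loop needs only one base reference live, rather than one per table, which is the property the patch relies on.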

Milliseconds to checksum 1 GB (16400 loops over a 64 KB chunk):

|| platform || original (ms) || cache-aware (ms) || improvement (%) ||
| x86 | 3894 | 2304 | 40.83 |
| x86_64 | 2131 | 1826 | 14 |
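
A measurement of this shape could be sketched roughly as follows. This is not the benchmark used for the numbers above: the class ChecksumBench and its methods are hypothetical, and java.util.zip.CRC32 stands in for PureJavaCrc32 so the example runs without Hadoop on the classpath. The improvementPercent helper assumes the improvement column is (original - cache-aware) / original, which reproduces 40.83 for the x86 row and about 14.3 for the x86_64 row.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical harness: repeatedly checksums one in-memory chunk.
final class ChecksumBench {
    // Sink for the final checksum so the timed loop has an observable result.
    static volatile long sink;

    static long timeMillis(Checksum sum, int iterations, int chunkBytes) {
        byte[] chunk = new byte[chunkBytes];
        new Random(0).nextBytes(chunk);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum.update(chunk, 0, chunk.length);
        }
        long elapsedNanos = System.nanoTime() - start;
        sink = sum.getValue();
        return elapsedNanos / 1_000_000;
    }

    // Relative reduction in run time, in percent.
    static double improvementPercent(long originalMs, long optimizedMs) {
        return 100.0 * (originalMs - optimizedMs) / originalMs;
    }

    public static void main(String[] args) {
        // Warm-up pass so the timed pass mostly measures JIT-compiled code.
        timeMillis(new CRC32(), 1_000, 64 * 1024);
        long ms = timeMillis(new CRC32(), 16_400, 64 * 1024);
        System.out.println("16400 x 64 KB: " + ms + " ms");
        System.out.printf("x86 improvement: %.2f%%%n", improvementPercent(3894, 2304));
    }
}
```

Because the same 64 KB chunk is reused on every iteration, the input stays cache-resident and the timing is dominated by the checksum loop and its table lookups.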

The improvement is considerably larger on x86 than on x86_64, because the separate static
arrays add register/stack pressure on the 32-bit platform.

A closer analysis of the JIT-compiled PureJavaCrc32 code shows the following assembly fragment:

  0x40f1e345: mov    $0x184,%ecx
  0x40f1e34a: mov    0x4415b560(%ecx),%ecx  ;*getstatic T8_5
                                        ; - PureJavaCrc32::update@95 (line 61)
                                        ;   {oop('PureJavaCrc32')}
  0x40f1e350: mov    %ecx,0x2c(%esp)

Basically, the static variables T8_0 through T8_7 are being spilled to the stack because
of register pressure. The x86_64 case has a lower likelihood of such pessimistic JIT code
due to the increased number of registers.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira