hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hal Mo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3235) Improve CPU cache behavior in map side sort
Date Sat, 16 Jun 2012 08:21:45 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13296127#comment-13296127
] 

Hal Mo commented on MAPREDUCE-3235:
-----------------------------------

@Todd
bq.  How did the wall clock differ?
Almost the same, depends on if there are rack local maps. the more rack local maps, the slower.

BTW, I know you are interested in JVM intrinsic binary array compare. Our team might implement
it, if we can see enough requirement. We work on JVM to support hadoop. For example, our GCIH
share side data between maps(share java object directly between JVMs), saving 2000G physical
memory on 200 nodes, and at the same time improve GC performance/job performance, for one
of our critical daily job.
                
> Improve CPU cache behavior in map side sort
> -------------------------------------------
>
>                 Key: MAPREDUCE-3235
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: map_sort_perf.diff, mr-3235-poc.txt
>
>
> When running oprofile on a terasort workload, I noticed that a large amount of CPU usage
was going to MapTask$MapOutputBuffer.compare. Upon disassembling this and looking at cycle
counters, most of the cycles were going to memory loads dereferencing into the array of key-value
data -- implying expensive cache misses. This can be avoided as follows:
> - rather than simply swapping indexes into the kv array, swap the entire meta entries
in the meta array. Swapping 16 bytes is only negligibly slower than swapping 4 bytes. This
requires adding the value-length into the meta array, since we used to rely on the previous-in-the-array
meta entry to determine this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
> - introduce an interface which allows key types to provide a 4-byte comparison proxy.
For string keys, this can simply be the first 4 bytes of the string. The idea is that, if
stringCompare(key1.proxy(), key2.proxy()) != 0, then compare(key1, key2) should have the same
result. If the proxies are equal, the normal comparison method is used. We then include the
4-byte proxy as part of the metadata entry, so that for many cases the indirection into the
data buffer can be avoided.
> On a terasort benchmark, these optimizations plus an optimization to WritableComparator.compareBytes
dropped the aggregate mapside CPU millis by 40%, and the compare() routine mostly dropped
off the oprofile results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message