hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4274) MapOutputBuffer should use native byte order for kvmeta
Date Mon, 21 May 2012 15:40:41 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280223#comment-13280223 ]

Robert Joseph Evans commented on MAPREDUCE-4274:
------------------------------------------------

I am not an expert on this code.  I dug into it, but it is somewhat complex, so I want
to check one thing with you before giving it a +1.  kvmeta is a wrapper around kvbuffer,
but it is only used to store the offsets into kvbuffer where the data is stored, not the keys
and values themselves.  Those are written into kvbuffer directly, bypassing kvmeta.  So, even though
kvbuffer is handed directly to the user-supplied RawComparator code, the bytes between
the offsets given to RawComparator are the same as they were before the change.  Is this analysis
correct?
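
To make the question concrete, here is a minimal sketch of the situation as I understand it. It is not the actual MapOutputBuffer code: the class name, the buffer size, and the positions used are all made up for illustration. It writes record bytes straight into the array and one metadata int through the int view, once per byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.util.Arrays;

// Hypothetical illustration only; names and layout are assumptions,
// not taken from MapOutputBuffer.
class KvMetaViewSketch {
    // Fills a small buffer: record bytes go directly into the array,
    // one metadata int goes through the int view with the given order.
    static byte[] fill(ByteOrder order) {
        byte[] kvbuffer = new byte[16];
        IntBuffer kvmeta = ByteBuffer.wrap(kvbuffer).order(order).asIntBuffer();

        // Serialized key/value bytes: written into bytes 0..3, bypassing kvmeta.
        byte[] record = {0x0A, 0x0B, 0x0C, 0x0D};
        System.arraycopy(record, 0, kvbuffer, 0, record.length);

        // Metadata int: written through the view at int index 3 (bytes 12..15).
        kvmeta.put(3, 0x01020304);
        return kvbuffer;
    }

    // The region a comparator would be handed (record bytes only).
    static byte[] recordBytes(ByteOrder order) {
        return Arrays.copyOfRange(fill(order), 0, 4);
    }

    // The region holding the metadata int, as raw bytes.
    static byte[] metaBytes(ByteOrder order) {
        return Arrays.copyOfRange(fill(order), 12, 16);
    }

    public static void main(String[] args) {
        System.out.println("record BE: " + Arrays.toString(recordBytes(ByteOrder.BIG_ENDIAN)));
        System.out.println("record LE: " + Arrays.toString(recordBytes(ByteOrder.LITTLE_ENDIAN)));
        System.out.println("meta   BE: " + Arrays.toString(metaBytes(ByteOrder.BIG_ENDIAN)));
        System.out.println("meta   LE: " + Arrays.toString(metaBytes(ByteOrder.LITTLE_ENDIAN)));
    }
}
```

In this sketch only the bytes backing the metadata int differ between the two orders; the record bytes are identical. Whether the real code keeps the two regions separate in the same way is exactly what I am asking about.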
                
> MapOutputBuffer should use native byte order for kvmeta
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-4274
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4274
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task
>    Affects Versions: 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: mapreduce-4274.txt
>
>
> I don't have a benchmark to support this, but this should give a small CPU improvement
> on the map output buffer: currently, we create {{kvmeta}} as {{ByteBuffer.wrap(kvbuffer).asIntBuffer()}}.
> According to the javadocs, the resulting int buffer will inherit its byte order from the ByteBuffer
> it comes from, and the byte buffer defaults to BIG_ENDIAN. Thus, on little-endian hardware, all of
> our int access to/from the buffer will require byte-swapping.
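
The byte-order behavior described above can be checked with a short sketch. This is an illustration under assumptions, not the contents of the attached patch: the class and method names are hypothetical, and the second method only shows one way the native order could be requested, as the issue title suggests.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical illustration only; not the attached patch.
class KvMetaOrderSketch {
    // How the description says kvmeta is created today:
    // the view inherits the ByteBuffer default, BIG_ENDIAN.
    static IntBuffer defaultView(byte[] kvbuffer) {
        return ByteBuffer.wrap(kvbuffer).asIntBuffer();
    }

    // One possible alternative: set the platform's native order on the
    // ByteBuffer before taking the int view, so the view inherits it.
    static IntBuffer nativeView(byte[] kvbuffer) {
        return ByteBuffer.wrap(kvbuffer)
                .order(ByteOrder.nativeOrder())
                .asIntBuffer();
    }

    public static void main(String[] args) {
        byte[] kvbuffer = new byte[64];
        System.out.println("default view order: " + defaultView(kvbuffer).order());
        System.out.println("native view order:  " + nativeView(kvbuffer).order());
        System.out.println("platform order:     " + ByteOrder.nativeOrder());
    }
}
```

The order has to be set on the ByteBuffer before calling asIntBuffer(), since the view takes its order at creation time.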

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
