hadoop-mapreduce-issues mailing list archives

From "Ruibang He (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1248) Redundant memory copying in StreamKeyValUtil
Date Wed, 21 Jul 2010 03:10:52 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890545#action_12890545 ]

Ruibang He commented on MAPREDUCE-1248:

You're welcome, Amareshwari :-)

> Redundant memory copying in StreamKeyValUtil
> --------------------------------------------
>                 Key: MAPREDUCE-1248
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1248
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: Ruibang He
>            Priority: Minor
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1248-v1.0.patch
> I found that when MROutputThread collects the output of the Reducer, it calls
> StreamKeyValUtil.splitKeyVal(), which allocates two local byte arrays for each line of
> output. These two byte arrays are later passed to the variables key and val. The data is
> therefore copied twice: once by System.arraycopy() and once more inside key.set() / val.set().
> This doubles the memory copying for the whole output (which may lead to higher CPU
> consumption) and causes frequent temporary object allocation.
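
The pattern described above can be illustrated with a minimal sketch. This is not the actual StreamKeyValUtil code or the attached patch: the class SplitSketch, the stand-in type Buf (a simplified substitute for a Hadoop writable whose set() copies bytes into its own buffer), and both method names are hypothetical, and a single-byte separator is assumed. The first method allocates two temporaries per line and so copies every byte twice; the second hands the original line buffer and offsets straight to set(), so each byte is copied once and no temporaries are created.

```java
import java.nio.charset.StandardCharsets;

public class SplitSketch {
    /** Hypothetical stand-in for a writable: set() copies into an internal, reusable buffer. */
    static final class Buf {
        private byte[] bytes = new byte[0];
        private int length;

        void set(byte[] src, int start, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];          // grow only when needed
            }
            System.arraycopy(src, start, bytes, 0, len);
            length = len;
        }

        void set(byte[] src) {
            set(src, 0, src.length);
        }

        String asString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Pattern from the report: two temporary arrays per line, each byte copied twice. */
    static void splitWithTemporaries(byte[] line, int len, int splitPos, Buf key, Buf val) {
        byte[] keyBytes = new byte[splitPos];
        System.arraycopy(line, 0, keyBytes, 0, splitPos);           // first copy
        int valLen = len - splitPos - 1;                            // skip 1-byte separator
        byte[] valBytes = new byte[valLen];
        System.arraycopy(line, splitPos + 1, valBytes, 0, valLen);  // first copy
        key.set(keyBytes);                                          // second copy
        val.set(valBytes);                                          // second copy
    }

    /** Alternative: pass offsets into the original buffer, so each byte is copied once. */
    static void splitDirect(byte[] line, int len, int splitPos, Buf key, Buf val) {
        key.set(line, 0, splitPos);
        val.set(line, splitPos + 1, len - splitPos - 1);
    }

    public static void main(String[] args) {
        byte[] line = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        Buf key = new Buf();
        Buf val = new Buf();
        splitDirect(line, line.length, 2, key, val);
        System.out.println(key.asString() + " -> " + val.asString());
    }
}
```

Both methods leave key and val with the same contents; the difference is only in how many intermediate arrays and copies each output line costs.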

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
