hadoop-mapreduce-issues mailing list archives

From "Ruibang He (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1248) Redundant memory copying in StreamKeyValUtil
Date Tue, 01 Dec 2009 07:33:22 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784098#action_12784098 ]

Ruibang He commented on MAPREDUCE-1248:

Thanks, Guanyin. The latest trunk has fixed the problem in KeyValueLineRecordReader.java,
but the problem still exists in StreamKeyValUtil.java. A patch is attached as an initial solution.

> Redundant memory copying in StreamKeyValUtil
> --------------------------------------------
>                 Key: MAPREDUCE-1248
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1248
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: Ruibang He
>            Priority: Minor
> I found that when MROutputThread collects the output of the Reducer, it calls StreamKeyValUtil.splitKeyVal(),
which allocates two local byte arrays for each line of output. These two byte arrays are later
passed to the variables key and val. The memory is copied twice here: once by System.arraycopy()
and once inside key.set() / val.set().
> This doubles the memory copying for the whole output (which may lead to higher CPU
consumption) and causes frequent temporary object allocation.
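
For illustration, the copying pattern described in the issue can be sketched in plain Java.
This is a minimal sketch under stated assumptions, not the attached patch: Buffer is a
hypothetical stand-in for a reusable writable such as Text (assumed to copy the given byte
range into its own storage on set()), the method names splitWithTemporaries and splitDirect
are made up for this example, and a single-byte separator at index splitPos is assumed.

```java
import java.nio.charset.StandardCharsets;

public class SplitKeyValSketch {

    /** Hypothetical stand-in for a reusable byte container; not a Hadoop class. */
    public static final class Buffer {
        private byte[] bytes = new byte[0];
        private int length;

        /** Copies len bytes starting at src[start] into internal storage. */
        public void set(byte[] src, int start, int len) {
            if (bytes.length < len) {
                bytes = new byte[len]; // grow only when the current array is too small
            }
            System.arraycopy(src, start, bytes, 0, len);
            length = len;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Pattern described in the issue: two temporary arrays, then a second copy in set(). */
    public static void splitWithTemporaries(byte[] line, int start, int length,
                                            int splitPos, Buffer key, Buffer val) {
        byte[] keyBytes = new byte[splitPos - start];
        byte[] valBytes = new byte[start + length - splitPos - 1];
        System.arraycopy(line, start, keyBytes, 0, keyBytes.length);        // copy 1
        System.arraycopy(line, splitPos + 1, valBytes, 0, valBytes.length); // copy 1
        key.set(keyBytes, 0, keyBytes.length);                              // copy 2
        val.set(valBytes, 0, valBytes.length);                              // copy 2
    }

    /** Possible alternative: pass offsets into the line, so each byte is copied once. */
    public static void splitDirect(byte[] line, int start, int length,
                                   int splitPos, Buffer key, Buffer val) {
        key.set(line, start, splitPos - start);
        val.set(line, splitPos + 1, start + length - splitPos - 1);
    }

    public static void main(String[] args) {
        byte[] line = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        Buffer key = new Buffer();
        Buffer val = new Buffer();
        splitDirect(line, 0, line.length, 2, key, val); // the tab is at index 2
        System.out.println(key + " / " + val);          // prints "k1 / v1"
    }
}
```

Both methods leave key and val with the same contents. The second one skips the
per-line temporary arrays and the extra copy, which is the saving the issue asks for.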

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
