hadoop-mapreduce-issues mailing list archives

From "Ruibang He (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1248) Redundant memory copying in StreamKeyValUtil
Date Mon, 30 Nov 2009 09:47:20 GMT
Redundant memory copying in StreamKeyValUtil

                 Key: MAPREDUCE-1248
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1248
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/streaming
            Reporter: Ruibang He
            Priority: Minor

I found that when MROutputThread collects the output of the Reducer, it calls StreamKeyValUtil.splitKeyVal(),
which allocates two local byte arrays for each line of output. These two byte arrays are then
passed to the variables key and val. The data is therefore copied twice: once by System.arraycopy()
and once more inside key.set() / val.set().

As a result, every byte of the output is copied twice (which may lead to higher CPU consumption),
and temporary objects are allocated frequently.
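
The pattern described above, and one possible way to avoid it, can be sketched roughly as follows.
This is only an illustration, not the actual Hadoop code: Buf is a hypothetical stand-in for the
reusable key/val objects, and the method names and parameters (lineLen, splitPos, sepLen) are
assumptions made for the example. It assumes that set() can accept an offset and a length into the
source array, so the line buffer can be handed over directly without intermediate arrays.

```java
import java.nio.charset.StandardCharsets;

final class SplitCopySketch {

    /** Hypothetical stand-in for a reusable key/val buffer; not a Hadoop class. */
    static final class Buf {
        private byte[] bytes = new byte[0];
        private int length;

        /** Copies len bytes starting at src[start]; reallocates only when too small. */
        void set(byte[] src, int start, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];
            }
            System.arraycopy(src, start, bytes, 0, len);
            length = len;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Pattern from the report: two temporary arrays per line, then set() copies again. */
    static void splitWithTemporaries(byte[] line, int lineLen, int splitPos, int sepLen,
                                     Buf key, Buf val) {
        int valLen = lineLen - splitPos - sepLen;
        byte[] keyBytes = new byte[splitPos];                            // temporary allocation
        byte[] valBytes = new byte[valLen];                              // temporary allocation
        System.arraycopy(line, 0, keyBytes, 0, splitPos);                // first copy
        System.arraycopy(line, splitPos + sepLen, valBytes, 0, valLen);  // first copy
        key.set(keyBytes, 0, keyBytes.length);                           // second copy
        val.set(valBytes, 0, valBytes.length);                           // second copy
    }

    /** Possible alternative: pass offsets into the line buffer, so each byte is copied once. */
    static void splitDirect(byte[] line, int lineLen, int splitPos, int sepLen,
                            Buf key, Buf val) {
        key.set(line, 0, splitPos);
        val.set(line, splitPos + sepLen, lineLen - splitPos - sepLen);
    }

    public static void main(String[] args) {
        byte[] line = "key\tvalue".getBytes(StandardCharsets.UTF_8);
        Buf key = new Buf();
        Buf val = new Buf();
        splitDirect(line, line.length, 3, 1, key, val);
        System.out.println(key + " / " + val);
    }
}
```

Both methods leave key and val with the same contents; the second simply skips the two temporary
arrays and the extra System.arraycopy() calls per line.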

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
