hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-3125) We should reuse key and value objects in the MultithreadedMapRunner.
Date Fri, 28 Mar 2008 21:34:24 GMT
We should reuse key and value objects in the MultithreadedMapRunner.
--------------------------------------------------------------------

                 Key: HADOOP-3125
                 URL: https://issues.apache.org/jira/browse/HADOOP-3125
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Owen O'Malley


Currently, a new key object and a new value object are allocated for every key/value pair
read from the record reader. It would be better to keep a pool of key/value pairs that are
reused. I'm picturing something like:

BlockingQueue<KeyValuePair> empties;
BlockingQueue<KeyValuePair> newInputs;

The record reader thread would take a KeyValuePair from the empties queue, read into it using
the RecordReader, and put it on the newInputs queue.

The worker threads would take pairs from newInputs, process the key and value, and put the
processed objects back on the empties queue. Initialization would put the desired number of
key/value pairs on the empties queue to start it off.
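
A minimal, self-contained sketch of the two-queue scheme described above might look like the following. It is not the MultithreadedMapRunner code and uses no Hadoop classes: the KeyValuePair fields, the END sentinel used to stop the workers, the run() method, and the upper-casing "map" step are all assumptions made for illustration, and a plain list of strings stands in for the RecordReader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PooledMapRunnerSketch {
    /** Hypothetical reusable holder; stands in for the KeyValuePair named above. */
    static final class KeyValuePair {
        long key;
        final StringBuilder value = new StringBuilder();
    }

    /** Sentinel that tells a worker the input is exhausted (an assumption of this sketch). */
    private static final KeyValuePair END = new KeyValuePair();

    static List<String> run(List<String> lines, int poolSize, int workers)
            throws InterruptedException {
        BlockingQueue<KeyValuePair> empties = new ArrayBlockingQueue<>(poolSize);
        // Room for every pooled pair plus one END sentinel per worker.
        BlockingQueue<KeyValuePair> newInputs = new ArrayBlockingQueue<>(poolSize + workers);

        // Initialization: the only place key/value objects are allocated.
        for (int i = 0; i < poolSize; i++) {
            empties.put(new KeyValuePair());
        }

        List<String> output = Collections.synchronizedList(new ArrayList<>());
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    while (true) {
                        KeyValuePair pair = newInputs.take();
                        if (pair == END) {
                            return;
                        }
                        // Stand-in for the user's map function.
                        output.add(pair.key + ":" + pair.value.toString().toUpperCase());
                        // Hand the processed object back for reuse.
                        empties.put(pair);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        // Reader side: blocks on empties whenever all pairs are in flight.
        long recordNumber = 0;
        for (String line : lines) {
            KeyValuePair pair = empties.take();
            pair.key = recordNumber++;
            pair.value.setLength(0);   // overwrite the old contents in place
            pair.value.append(line);
            newInputs.put(pair);
        }
        for (int i = 0; i < workers; i++) {
            newInputs.put(END);
        }
        for (Thread t : threads) {
            t.join();
        }
        return output;
    }
}
```

With this layout the pool size also bounds how far the reader can run ahead of the workers, since the reader blocks once every pair is in flight. Output order is not deterministic when more than one worker is used.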

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.