hadoop-common-dev mailing list archives

From "Richard Kasperski (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-531) Need to sort on more than the primary key
Date Wed, 13 Sep 2006 21:45:22 GMT
Need to sort on more than the primary key

                 Key: HADOOP-531
                 URL: http://issues.apache.org/jira/browse/HADOOP-531
             Project: Hadoop
          Issue Type: Improvement
          Components: contrib/streaming
    Affects Versions: 0.5.0
            Reporter: Richard Kasperski

There are many tasks where I need finer control over the ordering in the reduce than
a sort on a single key provides. Most of these situations arise when I merge two sources of
data and attach a single instance of one source to multiple instances of a second source.
I know that I can read all the records with a single key into memory, but there might
be many millions of them, making memory demands that cannot be satisfied.
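
To illustrate the kind of ordering being asked for, here is a minimal sketch in plain Python, not Hadoop streaming code. It assumes each record is tagged with its source (0 for the single-instance source, 1 for the multi-instance source) and that the sort uses the tag as a secondary key. The names `tag_records`, `streaming_join`, `users`, and `clicks` are hypothetical, and `sorted()` stands in for the framework's sort.

```python
from itertools import groupby
from operator import itemgetter


def tag_records(singles, many):
    """Emit (key, tag, value): tag 0 for the single-instance source, 1 for the other."""
    for key, value in singles:
        yield (key, 0, value)
    for key, value in many:
        yield (key, 1, value)


def streaming_join(sorted_records):
    """Join records already sorted by (key, tag).

    Because the tag-0 record sorts first within each key, only that one
    value is held in memory; the tag-1 records are streamed past it.
    """
    for key, group in groupby(sorted_records, key=itemgetter(0)):
        single = None
        for _, tag, value in group:
            if tag == 0:
                single = value
            else:
                yield (key, single, value)


# Hypothetical sample data: one user record per key, many click records per key.
users = [("u1", "Alice"), ("u2", "Bob")]
clicks = [("u2", "page-a"), ("u1", "page-b"), ("u1", "page-c")]

# Sort on the primary key AND the source tag (the secondary key).
records = sorted(tag_records(users, clicks), key=itemgetter(0, 1))
joined = list(streaming_join(records))
```

With a sort on the primary key alone, the single record could arrive anywhere within its group, so the reducer would have to buffer every record for that key before it could emit anything.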

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

