hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-531) Need to sort on more than the primary key
Date Fri, 15 Sep 2006 00:43:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-531?page=comments#action_12434863 ] 
Runping Qi commented on HADOOP-531:

HADOOP-531 captures a composite problem. To address the problem, we need to:

1. Allow different mappers for data from different input sources (HADOOP-372)
2. Allow sorting the map output data by multiple keys (HADOOP-485)
3. Allow cloning of reduce value iterators (HADOOP-475)
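
As an illustration only, the first two items can be sketched in plain Python. This is not Hadoop code and uses no Hadoop API: the function names, the LEFT/RIGHT source tags, and the in-memory sorted() call standing in for the framework's shuffle are all assumptions made for the example. Each source gets its own mapper, the map output is sorted on a composite (key, tag) pair, and the reducer groups on the primary key alone, so it only has to hold the single record from the first source while it streams through the records of the second.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical source tags: LEFT sorts before RIGHT, so the single "left"
# record for a key reaches the reducer ahead of the many "right" records.
LEFT, RIGHT = 0, 1


def map_left(record):
    """Mapper for the first source: emit ((key, LEFT), value)."""
    key, value = record
    return (key, LEFT), value


def map_right(record):
    """Mapper for the second source: emit ((key, RIGHT), value)."""
    key, value = record
    return (key, RIGHT), value


def shuffle(pairs):
    """Stand-in for the framework sort: order by primary key, then by tag."""
    return sorted(pairs, key=itemgetter(0))


def reduce_join(sorted_pairs):
    """Group on the primary key only and stream through each group.

    Only the left value is kept in memory; right values are joined and
    emitted one at a time. Keys with no left record are skipped.
    """
    for key, group in groupby(sorted_pairs, key=lambda kv: kv[0][0]):
        left_value = None
        for (_, tag), value in group:
            if tag == LEFT:
                left_value = value
            elif left_value is not None:
                yield key, left_value, value


def run_join(left_records, right_records):
    """Run both mappers, sort on the composite key, then reduce."""
    pairs = [map_left(r) for r in left_records]
    pairs += [map_right(r) for r in right_records]
    return list(reduce_join(shuffle(pairs)))
```

For example, run_join([("u1", "alice")], [("u1", "view"), ("u1", "buy")]) pairs "alice" with each of the two right-hand records without ever collecting those records into a list inside the reducer.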

> Need to sort on more than the primary key
> -----------------------------------------
>                 Key: HADOOP-531
>                 URL: http://issues.apache.org/jira/browse/HADOOP-531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.5.0
>            Reporter: Richard Kasperski
> There are many tasks where I need finer control over the ordering in the reduce
than a sort on a single key provides. Most of these situations arise when I merge two sources
of data and attach a single instance of one source to multiple instances of a second
source. I know that I can read all the records with a single key, but there
might be many millions of them, making memory demands that cannot be satisfied.


