hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-531) Need to sort on more than the primary key
Date Thu, 14 Sep 2006 04:11:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-531?page=comments#action_12434587 ] 
Runping Qi commented on HADOOP-531:

>Owen O'Malley [13/Sep/06 07:07 PM] I think you meant HADOOP-485, which is about having
a different comparator for determining equality in grouping for the reduce input. 


> Need to sort on more than the primary key
> -----------------------------------------
>                 Key: HADOOP-531
>                 URL: http://issues.apache.org/jira/browse/HADOOP-531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.5.0
>            Reporter: Richard Kasperski
> There are many tasks where I need to have finer control over the ordering in the reduce
than a sort on a single key provides. Most of these situations arise when I merge two sources
of data and am attaching a single instance of one source to multiple instances of a second
source. I know that I can read all the records with a single key. However, there might be
many millions of these, making memory demands that cannot be satisfied.
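
The ordering asked for above can be illustrated with a small sketch. It is not Hadoop or streaming code: the function name `reduce_side_join`, the `(key, tag, value)` record layout, and the tag values 0 and 1 are all assumptions made for illustration, and the in-memory `sorted` call merely stands in for the framework's sort. The point it shows is that once records are ordered by the primary key and then by a secondary tag, the single record from the first source arrives before the many records from the second, so the reducer only has to hold one value per key.

```python
from itertools import groupby
from operator import itemgetter


def reduce_side_join(records):
    """Join records given as (key, tag, value) tuples.

    Hypothetical layout: tag 0 marks the single record from the first
    source, tag 1 marks the many records from the second source.
    """
    # Stand-in for the framework's sort: primary key first, then the tag.
    ordered = sorted(records, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        single = None
        for _, tag, value in group:
            if tag == 0:
                # Arrives first within the key because of the secondary sort.
                single = value
            elif single is not None:
                # Emit each joined pair immediately; nothing is buffered.
                yield key, single, value
            # Records whose key has no tag-0 record are skipped (inner join).


records = [
    ("a", 1, "x1"),
    ("b", 1, "y1"),
    ("a", 0, "A"),
    ("a", 1, "x2"),
    ("b", 0, "B"),
]
joined = list(reduce_side_join(records))
```

With only a primary-key sort, the tag-0 record could appear anywhere within its key group, which is what forces the reducer to read all records for a key before it can emit anything.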

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

