hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3019) want input sampler & sorted partitioner
Date Mon, 15 Sep 2008 23:03:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631175#action_12631175 ]

Chris Douglas commented on HADOOP-3019:
---------------------------------------

Results of test-patch with HADOOP-4151 applied:
{noformat}
     [exec] +1 overall.  

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 18 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{noformat}

> want input sampler & sorted partitioner
> ---------------------------------------
>
>                 Key: HADOOP-3019
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3019
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Doug Cutting
>            Assignee: Chris Douglas
>             Fix For: 0.19.0
>
>         Attachments: 3019-0.patch
>
>
> The input sampler should generate a small, random sample of the input, saved to a file.
> The partitioner should read the sample file and partition keys into relatively even-sized key-ranges, where the partition numbers correspond to key order.
> Note that when the sampler is used for partitioning, the number of samples required is proportional to the number of reduce partitions.  10x the intended reducer count should give good results.
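
The sampling and range-partitioning idea described in the issue can be sketched in plain Java as follows. This is not the code from 3019-0.patch and it uses no Hadoop API: the class name SortedPartitionSketch and its methods sample, splitPoints, and getPartition are hypothetical, keys are assumed to be Strings in natural order, and reservoir sampling is only one assumed way to draw a small random sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Hadoop implementation.
class SortedPartitionSketch {

    /** Draws up to sampleSize keys uniformly at random (reservoir sampling). */
    static List<String> sample(Iterable<String> keys, int sampleSize, Random rng) {
        List<String> reservoir = new ArrayList<>(sampleSize);
        int seen = 0;
        for (String key : keys) {
            seen++;
            if (reservoir.size() < sampleSize) {
                reservoir.add(key);                 // fill the reservoir first
            } else {
                int slot = rng.nextInt(seen);       // uniform in [0, seen)
                if (slot < sampleSize) {
                    reservoir.set(slot, key);       // replace with probability sampleSize/seen
                }
            }
        }
        return reservoir;
    }

    /** Picks numPartitions - 1 split keys at evenly spaced ranks of the sorted sample.
     *  Assumes the sample is non-empty. */
    static String[] splitPoints(List<String> sample, int numPartitions) {
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        String[] splits = new String[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            int rank = (int) ((long) i * sorted.size() / numPartitions);
            splits[i - 1] = sorted.get(rank);
        }
        return splits;
    }

    /** Maps a key to a partition so that partition numbers follow key order.
     *  Keys below splits[0] go to partition 0; keys equal to a split go above it. */
    static int getPartition(String key, String[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        List<String> keys = new ArrayList<>();
        Random rng = new Random(42);
        for (int i = 0; i < 100_000; i++) {
            keys.add(String.format("%08d", rng.nextInt(100_000_000)));
        }
        // 10x the intended reducer count, as suggested in the issue description.
        List<String> sampled = sample(keys, 10 * numReducers, rng);
        String[] splits = splitPoints(sampled, numReducers);

        int[] counts = new int[numReducers];
        for (String key : keys) {
            counts[getPartition(key, splits)]++;
        }
        System.out.println("splits: " + Arrays.toString(splits));
        System.out.println("keys per partition: " + Arrays.toString(counts));
    }
}
```

In this sketch, repeated keys in the sample can produce duplicate split points, which leaves some partitions empty; a larger sample makes the key ranges more even but costs more to collect.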

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

