pig-dev mailing list archives

From "Bill Graham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2779) Refactoring the code for setting number of reducers
Date Wed, 25 Jul 2012 22:21:35 GMT

    [ https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422680#comment-13422680 ]

Bill Graham commented on PIG-2779:

Great! We should probably set {{estimatedParallelism}} too, in case that's what was used.
If default parallel is used, it should already show up as {{default_parallel}}, so we can
omit that one. How about this:


I'm prefixing with 'info' to denote that these fields are not accepted as input. Instead they
are produced as output for debugging and analysis. I'll send an email to pig-dev about the
suggested syntax.

> Refactoring the code for setting number of reducers
> ---------------------------------------------------
>                 Key: PIG-2779
>                 URL: https://issues.apache.org/jira/browse/PIG-2779
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Jie Li
>            Assignee: Jie Li
>             Fix For: 0.11
>         Attachments: PIG-2779.0.patch, PIG-2779.1.patch, PIG-2779.2.patch, TestNumberOfReducers.java,
> As PIG-2652 observed, the code for setting the number of reducers is currently a little messy.
> MapReduceOper.requestedParallelism seems to be misused in some places, and we now support
> runtime estimation of the number of reducers, which further complicates the problem.
> For example, if we specify parallel 1 for the order-by, the estimated number of reducers will
> be used instead. If we specify parallel 2 while it estimates 4, order-by will fail due to
> "Illegal partition for Null". If we specify parallel 4 while it estimates 2, then some
> reducers will have nothing to do.
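
The two order-by failure modes described above can be reproduced in a small, self-contained sketch. This is not Pig's code: the class ReducerMismatchSketch, its methods, and the integer boundary table are hypothetical stand-ins for a range partitioner whose boundaries were sized for one reducer count while the job runs with another. It only illustrates why a mismatch either leaves reducers idle or yields an out-of-range partition.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Pig or Hadoop.
public class ReducerMismatchSketch {

    /** Maps a key to a partition using sorted, inclusive upper boundaries. */
    public static int partitionFor(int key, int[] boundaries) {
        int idx = Arrays.binarySearch(boundaries, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent
        return idx >= 0 ? idx : -idx - 1;
    }

    /** Mimics the range check applied before a record is routed to a reducer. */
    public static void checkPartition(int partition, int numReducers) {
        if (partition < 0 || partition >= numReducers) {
            throw new IllegalStateException(
                "Illegal partition (" + partition + ") for " + numReducers + " reducers");
        }
    }

    /** Counts records per reducer; throws if a record maps outside the reducer range. */
    public static int[] route(int[] keys, int[] boundaries, int numReducers) {
        int[] counts = new int[numReducers];
        for (int key : keys) {
            int p = partitionFor(key, boundaries);
            checkPartition(p, numReducers);
            counts[p]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] keys = {5, 15, 25, 35};

        // Boundaries sized for an estimate of 2, job runs with 4 reducers:
        // the last two reducers receive nothing.
        int[] twoWay = {20};
        System.out.println(Arrays.toString(route(keys, twoWay, 4))); // [2, 2, 0, 0]

        // Boundaries sized for an estimate of 4, job runs with 2 reducers:
        // key 25 maps to partition 2, which does not exist.
        int[] fourWay = {10, 20, 30};
        try {
            route(keys, fourWay, 2);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Illegal partition (2) for 2 reducers
        }
    }
}
```

In this sketch both problems disappear once the boundary table and the reducer count are derived from the same number, which is the kind of consistency the refactoring is after.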
