hadoop-pig-dev mailing list archives

From "Shravan Matthur Narayanamurthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-615) Wrong number of jobs with limit
Date Mon, 19 Jan 2009 09:34:00 GMT

    [ https://issues.apache.org/jira/browse/PIG-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665077#action_12665077 ]

Shravan Matthur Narayanamurthy commented on PIG-615:

Should I submit the changes I suggested in our discussion as a patch?

A summary of the discussion follows:
Under the current logic, whether the 4th MR job is generated for a limit depends on the
use of the *"parallel"* keyword.
Although the logic does not depend directly on the cluster configuration, some cluster
configurations require this 4th MR job and some do not.
For example, if the cluster is configured to set the number of reducers to one when parallelism
is -1 or unspecified, then our current logic works, because the 4th MR job is redundant.
However, if the cluster is configured to use some other number of reducers when parallelism is
unspecified, such as 0.9 times the number of reduce slots, then the 4th MR job is necessary.

That is, the code implicitly assumes that if parallel is not explicitly specified, the
number of reducers is 1. So the logic needs to be changed to include the 4th MR job whenever
the parallelism is not explicitly set to 1. We will then produce correct results, though in
some cases the extra job will be redundant.
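
The difference between the two rules can be sketched as below. This is only an illustration of the reasoning in this comment, not code from the Pig code base: the function names, the use of -1 for "unspecified", and the exact form of the current check are assumptions made for the example.

```python
# Hypothetical sketch; none of these names come from the Pig source.
UNSPECIFIED = -1  # assumed value of parallelism when no "parallel" clause is given


def extra_job_current(parallel: int) -> bool:
    # Assumed reading of the current logic: an unspecified parallelism is
    # treated as if it meant one reducer, so no extra job is added for it.
    return parallel != UNSPECIFIED and parallel != 1


def extra_job_proposed(parallel: int) -> bool:
    # Proposed rule: add the extra job unless parallelism is explicitly 1.
    return parallel != 1


def effective_reducers(parallel: int, cluster_default: int) -> int:
    # The cluster decides the reducer count when the script does not.
    return cluster_default if parallel == UNSPECIFIED else parallel


def limit_is_correct(adds_extra_job: bool, parallel: int, cluster_default: int) -> bool:
    # The limit is only guaranteed when a single reducer applies it last:
    # either the job already runs with one reducer, or an extra job follows.
    return adds_extra_job or effective_reducers(parallel, cluster_default) == 1


# Cluster A defaults to 1 reducer; cluster B defaults to 9
# (for example 0.9 times 10 reduce slots).
for cluster_default in (1, 9):
    current_ok = limit_is_correct(
        extra_job_current(UNSPECIFIED), UNSPECIFIED, cluster_default)
    proposed_ok = limit_is_correct(
        extra_job_proposed(UNSPECIFIED), UNSPECIFIED, cluster_default)
    print(cluster_default, current_ok, proposed_ok)
```

With no parallel clause, the current rule is only correct on the cluster that defaults to one reducer, while the proposed rule is correct on both, at the cost of a redundant job on the first cluster.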

> Wrong number of jobs with limit
> -------------------------------
>                 Key: PIG-615
>                 URL: https://issues.apache.org/jira/browse/PIG-615
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Shravan Matthur Narayanamurthy

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
