hadoop-pig-dev mailing list archives

From "Pradeep Kamath (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1035) support for skewed outer join
Date Fri, 30 Oct 2009 21:15:59 GMT

    [ https://issues.apache.org/jira/browse/PIG-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772095#action_12772095 ]

Pradeep Kamath commented on PIG-1035:
-------------------------------------

The unit test does not seem to check the results of the outer join; it would be good to add
a check of the actual results.
In fact, there are already outer join tests in TestJoin.java. You can just update those to
also test skewed join, since they already check output correctness.

In LogToPhyTranslationVisitor.java, in the following code, the return value of op.getSchema()
should be checked for null, in which case the same exception should be thrown:
{code}
try {
    skj.addSchema(op.getSchema());
} catch (FrontendException e) {
    int errCode = 2015;
    String msg = "Couldn't set the schema for outer join";
    throw new LogicalToPhysicalTranslatorException(msg, errCode, PigException.BUG, e);
}
{code}
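A minimal sketch of the null check suggested here, not the actual patch. To stay self-contained it uses hypothetical stand-ins (Schema, FrontendException, TranslatorException, LogicalOp, SkewedJoin) in place of the real Pig classes, and the helper name addInputSchema is made up for illustration. The only point is that a null schema and a FrontendException both end in the same error code and message:

```java
import java.util.ArrayList;
import java.util.List;

class SchemaNullCheckSketch {

    // Hypothetical stand-in for Pig's Schema class.
    static class Schema {}

    // Hypothetical stand-in for Pig's FrontendException.
    static class FrontendException extends Exception {
        FrontendException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for LogicalToPhysicalTranslatorException.
    static class TranslatorException extends Exception {
        final int errCode;
        TranslatorException(String msg, int errCode, Throwable cause) {
            super(msg, cause);
            this.errCode = errCode;
        }
    }

    // Hypothetical stand-in for the logical operator feeding the join.
    interface LogicalOp {
        Schema getSchema() throws FrontendException;
    }

    // Hypothetical stand-in for the physical skewed join operator.
    static class SkewedJoin {
        final List<Schema> schemas = new ArrayList<>();
        void addSchema(Schema s) { schemas.add(s); }
    }

    // Adds the input's schema to the join; a null schema is reported
    // exactly like a failure to retrieve the schema.
    static void addInputSchema(SkewedJoin skj, LogicalOp op) throws TranslatorException {
        int errCode = 2015;
        String msg = "Couldn't set the schema for outer join";
        Schema schema;
        try {
            schema = op.getSchema();
        } catch (FrontendException e) {
            throw new TranslatorException(msg, errCode, e);
        }
        if (schema == null) {
            // No underlying exception to wrap in this case.
            throw new TranslatorException(msg, errCode, null);
        }
        skj.addSchema(schema);
    }

    public static void main(String[] args) throws Exception {
        SkewedJoin skj = new SkewedJoin();
        addInputSchema(skj, () -> new Schema());
        System.out.println("schemas added: " + skj.schemas.size());
        try {
            addInputSchema(skj, () -> null);
        } catch (TranslatorException e) {
            System.out.println("rejected null schema, errCode=" + e.errCode);
        }
    }
}
```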
With the above code, a schema is required for both inputs to the join. Strictly, for left and
right outer joins, only the schema of the side where nulls need to be projected is needed.
Only in a full outer join should both inputs be required to have schemas. If possible, for
left and right outer joins the restriction should be to require a schema only on the relevant
input. For reference, left and right outer joins in regular join already do this.
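To make the rule concrete, here is a rough sketch of which input would need a schema for each outer join type in a two-way join. The enum OuterJoinType and the method schemaRequired are hypothetical names for illustration, not Pig APIs. The reasoning is that nulls are projected for the non-preserved side, so that is the side whose column layout must be known:

```java
import java.util.Arrays;

class OuterJoinSchemaRequirement {

    // Hypothetical enum; Pig itself does not necessarily model join types this way.
    enum OuterJoinType { LEFT, RIGHT, FULL }

    // For a two-way join, returns {leftNeedsSchema, rightNeedsSchema}.
    static boolean[] schemaRequired(OuterJoinType type) {
        switch (type) {
            case LEFT:
                // Unmatched left rows get nulls for the right input's columns.
                return new boolean[] { false, true };
            case RIGHT:
                // Unmatched right rows get nulls for the left input's columns.
                return new boolean[] { true, false };
            case FULL:
                // Nulls may be projected for either side.
                return new boolean[] { true, true };
            default:
                throw new IllegalArgumentException("Unknown join type: " + type);
        }
    }

    public static void main(String[] args) {
        for (OuterJoinType t : OuterJoinType.values()) {
            System.out.println(t + " -> " + Arrays.toString(schemaRequired(t)));
        }
    }
}
```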

> support for skewed outer join
> -----------------------------
>
>                 Key: PIG-1035
>                 URL: https://issues.apache.org/jira/browse/PIG-1035
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Olga Natkovich
>            Assignee: Sriranjan Manjunath
>         Attachments: 1035.patch
>
>
> Similarly to skewed inner join, skewed outer join will help to scale in the presence of join keys that don't fit into memory

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

