pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-4541) Skewed full outer join does not return records if any relation is empty. Outer join does not return any record if left relation is empty
Date Tue, 26 May 2015 22:09:17 GMT

    [ https://issues.apache.org/jira/browse/PIG-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559971#comment-14559971 ]

Daniel Dai commented on PIG-4541:
---------------------------------

[~dghosal], I can see the skewed join issue when the left relation is empty, but I cannot
reproduce the regular join issue in any version. Do you have a specific script?

> Skewed full outer join does not return records if any relation is empty. Outer join does not return any record if left relation is empty
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4541
>                 URL: https://issues.apache.org/jira/browse/PIG-4541
>             Project: Pig
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.14.0
>         Environment: HDP 2.2.4
>            Reporter: Dipankar
>            Assignee: Daniel Dai
>             Fix For: 0.15.0
>
>         Attachments: PIG-4541-1.patch
>
>
> Test1:
> Perform a full join on two relations, with the left relation empty and the right relation containing records
> empty_relation = FILTER a_relation by (join_column=='eliminate everything');
> Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by (join_column);
> Result : Zero records returned.
> Test2:
> Perform a full join on two relations, with the left relation empty and the right relation containing records, using skewed
> Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by (join_column) using 'skewed';
> Result : Zero records returned.
> Test3:
> Perform a full join on two relations, with the left relation empty and the right relation containing records, using parallel
> Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by (join_column) PARALLEL 10;
> Result : Zero records returned.
> Test4:
> Perform a full join on two relations, with the left relation non-empty and the right relation empty, using parallel
> Test_output = JOIN non_empty_relation by (join_column) FULL , empty_relation by (join_column) PARALLEL 10;
> Result : Valid records returned.
> Observation:
> 1) If either relation is empty, skewed full outer join does not return anything
> 2) If the non-empty relation is kept on the left, everything works except skewed
> 3) FULL OUTER only works if the left relation is not empty
> 4) Skewed only works if both relations are non-empty.
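
For reference, the result each test in the description should produce can be modelled outside Pig. The following is a minimal Python sketch of full outer join semantics. It is not Pig code and is not taken from the attached patch; the function name full_outer_join, the sample rows, and the two-column schema are assumptions made only for illustration.

```python
from collections import defaultdict


def full_outer_join(left, right, left_width, right_width):
    """Reference model of a full outer join keyed on the first field of each tuple.

    left_width / right_width give the number of columns in each relation so that
    the missing side can be padded with nulls even when that relation is empty.
    """
    # Index the right relation by join key.
    right_by_key = defaultdict(list)
    for row in right:
        right_by_key[row[0]].append(row)

    null_left = (None,) * left_width
    null_right = (None,) * right_width
    out = []
    matched_keys = set()

    # Every left row appears at least once: joined, or padded with nulls.
    for lrow in left:
        key = lrow[0]
        # A null key never matches anything in this model.
        matches = right_by_key.get(key, []) if key is not None else []
        if matches:
            matched_keys.add(key)
            out.extend(lrow + rrow for rrow in matches)
        else:
            out.append(lrow + null_right)

    # Every right row that found no partner is emitted with a null left side.
    for rrow in right:
        if rrow[0] is None or rrow[0] not in matched_keys:
            out.append(null_left + rrow)
    return out


# Hypothetical sample data: (join_column, value)
non_empty_relation = [("k1", "a"), ("k2", "b")]
empty_relation = []

# Shape of Test1 to Test3: empty relation on the left.
left_empty = full_outer_join(empty_relation, non_empty_relation, 2, 2)
# Shape of Test4: empty relation on the right.
right_empty = full_outer_join(non_empty_relation, empty_relation, 2, 2)
```

Under this model, both orderings return one output row per record of the non-empty relation, with the columns of the empty side set to null. A result of zero records, as reported for Test1 to Test3, does not match these semantics.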



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
