pig-dev mailing list archives

From "Viraj Bhat (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2316) Incorrect results for FILTER *** BY ( *** OR ***) with FilterLogicExpressionSimplifier optimizer turned on
Date Mon, 10 Oct 2011 21:44:29 GMT

    [ https://issues.apache.org/jira/browse/PIG-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124503#comment-13124503 ]

Viraj Bhat commented on PIG-2316:
---------------------------------

Thejas, I agree with your comments. I also agree that we should disable this optimization
in the next releases of Pig, since we are getting wrong results without any warning. At
the same time, what is the speedup in script runtime due to this rule?
                
> Incorrect results for FILTER *** BY ( *** OR ***) with FilterLogicExpressionSimplifier optimizer turned on
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2316
>                 URL: https://issues.apache.org/jira/browse/PIG-2316
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.9.1
>            Reporter: Huanyu Zhao
>            Priority: Critical
>             Fix For: 0.8.1, 0.9.2
>
>         Attachments: pig-2316-trunk-v1.txt
>
>
> An example for this bug: 
> cat weird.txt
> 1,a
> 2,b
> 3,c
> When running pig with the following statements:
> A = LOAD 'weird.txt' using PigStorage(',') AS (col1:int,col2);
> B = FILTER A BY ((col1==1) OR (col1 != 1));
> DUMP B;
> I expect to get the result of all three rows back, but I receive only two rows.
> (2,b)
> (3,c)
> When we start pig with the optimizer turned off:
> pig -optimizer_off All
> we get the expected results: three rows for the same statements.
> (1,a)
> (2,b)
> (3,c)
> --------------------------------------------------------
> This bug was tested on:
> pig-0.9.1, 
> pig-0.9.0, 
> pig-0.8.1, 
> pig-0.8.0
> All produced the same incorrect results.
> --------------------------------------------------------
> When we looked at the logical plan for this example, we found that the
> FilterLogicExpressionSimplifier optimizer produced an incorrect logical plan, so we
> suspect the bug is caused by that optimizer.
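
For reference, here is a minimal Python sketch (not Pig code) of the semantics the reporter expects from the FILTER. It assumes the three rows of weird.txt shown above. The names `rows`, `keep`, and `filtered` are hypothetical and exist only for this illustration. The predicate (col1 == 1) OR (col1 != 1) is true for every non-null integer, so all three rows should pass.

```python
# Rows from weird.txt as (col1, col2) tuples, with col1 parsed as an int.
rows = [(1, "a"), (2, "b"), (3, "c")]


def keep(col1):
    """Model of the FILTER predicate: (col1 == 1) OR (col1 != 1).

    Assumption for this sketch: a null col1 makes both comparisons
    unknown in Pig, so the row is dropped. That is modeled here by
    returning False for None.
    """
    if col1 is None:
        return False
    return (col1 == 1) or (col1 != 1)


# Equivalent of: B = FILTER A BY ((col1==1) OR (col1 != 1));
filtered = [row for row in rows if keep(row[0])]
print(filtered)
```

None of the sample rows has a null col1, so under these semantics no row should be removed. This matches the three-row output obtained with the optimizer turned off.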

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
