From: "Daniel Dai (JIRA)"
To: pig-dev@hadoop.apache.org
Date: Fri, 13 May 2011 00:48:47 +0000 (UTC)
Subject: [jira] [Commented] (PIG-2067) FilterLogicExpressionSimplifier removed some branches in some cases
Message-ID: <1655031337.8760.1305247727482.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1774297321.8420.1305239507823.JavaMail.tomcat@hel.zones.apache.org>

    [ https://issues.apache.org/jira/browse/PIG-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032774#comment-13032774 ]

Daniel Dai commented on PIG-2067:
---------------------------------

This issue happens when:
1. The filter plan contains an AND.
2. Both branches of the AND call the same UDF, but with different inputs.

LogicExpressionSimplifier then erroneously treats the two branches as identical and merges them.

> FilterLogicExpressionSimplifier removed some branches in some cases
> -------------------------------------------------------------------
>
>                 Key: PIG-2067
>                 URL: https://issues.apache.org/jira/browse/PIG-2067
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.1, 0.9.0
>
>         Attachments: PIG-2067-1.patch
>
>
> The following script produces the wrong result:
> {code}
> A = load 'a.dat' as (cookie);
> B = load 'b.dat' as (cookie);
> C = cogroup A by cookie, B by cookie;
> E = filter C by COUNT(B)>0 AND COUNT(A)>0;
> explain E;
> {code}
> a.dat:
> 1 1
> 2 2
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> b.dat:
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> 8 8
> Expected output:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> We get:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> (8,{},{(8)})
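
To make the two conditions in the comment concrete, here is a small Python toy model of the failure. It is only an illustrative sketch, not Pig's actual code: the classes `Udf`, `GreaterThan`, and `And`, and the helpers `same_ignoring_inputs`, `same_including_inputs`, `simplify`, `evaluate`, and `cogroup` are all hypothetical names invented for this example. It assumes the simplifier collapses `X AND X` to `X`, and contrasts an equality check that ignores UDF inputs with one that includes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Udf:
    name: str  # UDF name, e.g. "COUNT"
    arg: str   # bag the UDF reads, e.g. "A" or "B"


@dataclass(frozen=True)
class GreaterThan:
    lhs: Udf
    rhs: int


@dataclass(frozen=True)
class And:
    left: GreaterThan
    right: GreaterThan


def same_ignoring_inputs(x, y):
    # Flawed check (hypothetical): compares UDF name and constant, not the input.
    return x.lhs.name == y.lhs.name and x.rhs == y.rhs


def same_including_inputs(x, y):
    # Dataclass equality also compares the UDF's input.
    return x == y


def simplify(expr, same):
    # Collapse `X AND X` to `X` when `same` judges both branches identical.
    if isinstance(expr, And) and same(expr.left, expr.right):
        return expr.left
    return expr


def evaluate(expr, row):
    if isinstance(expr, And):
        return evaluate(expr.left, row) and evaluate(expr.right, row)
    # Model COUNT(bag) > constant as the bag's length compared to the constant.
    return len(row[expr.lhs.arg]) > expr.rhs


def cogroup(a, b):
    # One row per key, holding the matching values from each input as bags.
    keys = sorted(set(a) | set(b))
    return [
        {"group": k, "A": [v for v in a if v == k], "B": [v for v in b if v == k]}
        for k in keys
    ]


# Cookie values from a.dat (1..7) and b.dat (3..8) in the report.
rows = cogroup(list(range(1, 8)), list(range(3, 9)))

# COUNT(B)>0 AND COUNT(A)>0
plan = And(GreaterThan(Udf("COUNT", "B"), 0), GreaterThan(Udf("COUNT", "A"), 0))


def run(same):
    simplified = simplify(plan, same)
    return [row["group"] for row in rows if evaluate(simplified, row)]


buggy_keys = run(same_ignoring_inputs)
fixed_keys = run(same_including_inputs)
print(buggy_keys)  # [3, 4, 5, 6, 7, 8]
print(fixed_keys)  # [3, 4, 5, 6, 7]
```

With the flawed check, the AND collapses to its first branch, `COUNT(B)>0`, so key 8 (empty A bag, non-empty B bag) passes the filter, matching the extra `(8,{},{(8)})` row in the report. Whether the attached patch fixes the comparison this way is not stated in the message.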