Date: Fri, 4 Dec 2015 02:05:11 +0000 (UTC)
From: "liyunzhang_intel (JIRA)"
To: pig-dev@hadoop.apache.org
Subject: [jira] [Updated] (PIG-4675) FR+Limit case fails when enable MultiQuery because the predecessor information is wrongly calculated in current code.

     [ https://issues.apache.org/jira/browse/PIG-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4675:
----------------------------------
    Summary: FR+Limit case fails when enable MultiQuery because the predecessor information is wrongly calculated in current code.  (was: Multi Store Statement will fail on the second store statement.)

> FR+Limit case fails when enable MultiQuery because the predecessor information is wrongly calculated in current code.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4675
>                 URL: https://issues.apache.org/jira/browse/PIG-4675
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Peter Lin
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: name.txt, ssn.txt, test.pig
>
>
> We have recently been testing the Pig spark branch with mapr3 and Spark 1.5. It turns out that if a Pig script uses more than one store command, the second store command fails with an exception.
> SSN = load '/test/ssn.txt' using PigStorage() as (ssn:long);
> SSN_NAME = load '/test/name.txt' using PigStorage() as (ssn:long, name:chararray);
> X = JOIN SSN by ssn LEFT OUTER, SSN_NAME by ssn USING 'replicated';
> R1 = limit SSN_NAME 10;
> store R1 into '/tmp/test1_r1';
> store X into '/tmp/test1_x';
> Exception Details:
> 15/09/11 13:37:00 INFO storage.MemoryStore: ensureFreeSpace(114448) called with curMem=359237, maxMem=503379394
> 15/09/11 13:37:00 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 111.8 KB, free 479.6 MB)
> 15/09/11 13:37:00 INFO storage.MemoryStore: ensureFreeSpace(32569) called with curMem=473685, maxMem=503379394
> 15/09/11 13:37:00 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 31.8 KB, free 479.6 MB)
> 15/09/11 13:37:00 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.51.2.82:55960 (size: 31.8 KB, free: 479.9 MB)
> 15/09/11 13:37:00 INFO spark.SparkContext: Created broadcast 2 from newAPIHadoopRDD at LoadConverter.java:88
> 15/09/11 13:37:00 WARN util.ClosureCleaner: Expected a closure; got org.apache.pig.backend.hadoop.executionengine.spark.converter.LoadConverter$ToTupleFunction
> 15/09/11 13:37:00 INFO spark.SparkLauncher: Converting operator POForEach (Name: SSN: New For Each(false)[bag] - scope-17 Operator Key: scope-17)
> 15/09/11 13:37:00 INFO spark.SparkLauncher: Converting operator POFRJoin (Name: X: FRJoin[tuple] - scope-22 Operator Key: scope-22)
> 15/09/11 13:37:00 ERROR spark.SparkLauncher: throw exception in sparkOperToRDD:
> java.lang.RuntimeException: Should have greater than1 predecessors for class org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin. Got : 1
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkUtil.assertPredecessorSizeGreaterThan(SparkUtil.java:93)
>     at org.apache.pig.backend.hadoop.executionengine.spark.converter.FRJoinConverter.convert(FRJoinConverter.java:55)
>     at org.apache.pig.backend.hadoop.executionengine.spark.converter.FRJoinConverter.convert(FRJoinConverter.java:46)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:633)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:600)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:621)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkOperToRDD(SparkLauncher.java:552)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkPlanToRDD(SparkLauncher.java:501)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:204)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
>     at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
>     at org.apache.pig.PigServer.execute(PigServer.java:1364)
>     at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
>     at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
>     at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
>     at org.apache.pig.Main.run(Main.java:624)
>     at org.apache.pig.Main.main(Main.java:170)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
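
For readers of the archive, the failing check in the trace can be pictured with a small, language-neutral sketch. It is not Pig's code: the Plan class, the operator labels, and the helper below are hypothetical stand-ins, and the "lost edge" in the second plan is only an assumption used to reproduce the shape of the error, since the report does not say which step miscalculates the predecessors. The sketch models the script's plan as a DAG and shows that a replicated join needs more than one predecessor to pass a check modelled on the message in the log.

```python
# Hypothetical, simplified model of a physical plan; not Pig's actual classes.
from collections import defaultdict


class Plan:
    """A tiny DAG that records, for each operator, the operators feeding it."""

    def __init__(self):
        self._preds = defaultdict(list)

    def connect(self, src, dst):
        # Record that the output of `src` is an input of `dst`.
        self._preds[dst].append(src)

    def predecessors(self, op):
        return list(self._preds[op])


def assert_predecessor_size_greater_than(preds, op_name, size):
    # Modelled on the error text in the log; the real check lives in
    # SparkUtil and may differ in detail.
    if len(preds) <= size:
        raise RuntimeError(
            f"Should have greater than {size} predecessors for {op_name}. "
            f"Got : {len(preds)}"
        )


# Plan for the script in the report: SSN_NAME feeds both the join and the limit.
plan = Plan()
plan.connect("load SSN", "FRJoin X")
plan.connect("load SSN_NAME", "FRJoin X")
plan.connect("load SSN_NAME", "Limit R1")

# Both join inputs are visible, so the check passes silently.
assert_predecessor_size_greater_than(plan.predecessors("FRJoin X"), "FRJoin X", 1)

# Assumption for illustration: the shared input's edge to the join is dropped,
# leaving the join with a single predecessor.
broken = Plan()
broken.connect("load SSN", "FRJoin X")
broken.connect("load SSN_NAME", "Limit R1")
try:
    assert_predecessor_size_greater_than(broken.predecessors("FRJoin X"), "FRJoin X", 1)
except RuntimeError as err:
    print(err)
```

In the first plan the join sees both of its inputs. In the second it sees only one, which is the condition the RuntimeException in the log reports for POFRJoin.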