Date: Thu, 29 Jan 2015 01:06:34 +0000 (UTC)
From: "Jinfeng Ni (JIRA)"
To: dev@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-2107) Hash Join throw IOBE for a query with exists subquery.

Jinfeng Ni created DRILL-2107:
---------------------------------

    Summary: Hash Join throw IOBE for a query with exists subquery.
    Key: DRILL-2107
    URL: https://issues.apache.org/jira/browse/DRILL-2107
    Project: Apache Drill
    Issue Type: New Feature
    Components: Execution - Operators
    Reporter: Jinfeng Ni
    Assignee: Chris Westin
    Priority: Critical

I hit an IndexOutOfBoundsException (IOBE) in TestTpchDistributed Q4 when I tried to enable an optimizer rule. I then simplified Q4 to the following query, which still reproduces the same IOBE:
{code}
select o.o_orderpriority
from cp.`tpch/orders.parquet` o
where exists (
  select *
  from cp.`tpch/lineitem.parquet` l
  where l.l_orderkey = o.o_orderkey
);
{code}

Stack trace of the exception:
{code}
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[na:1.7.0_45]
	at java.util.ArrayList.get(ArrayList.java:411) ~[na:1.7.0_45]
	at org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:232) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:149) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:132) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.test.generated.HashTableGen307.doSetup(HashTableTemplate.java:71) ~[na:na]
	at org.apache.drill.exec.test.generated.HashTableGen307.updateBatches(HashTableTemplate.java:473) ~[na:na]
	at org.apache.drill.exec.test.generated.HashJoinProbeGen313.executeProbePhase(HashJoinProbeTemplate.java:139) ~[na:na]
	at org.apache.drill.exec.test.generated.HashJoinProbeGen313.probeAndProject(HashJoinProbeTemplate.java:223) ~[na:na]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:227) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	....
{code}

The physical plan looks correct after the new rule is enabled. In fact, if I disable HashJoin and use merge join for the query, it works fine. So the IOBE seems to expose a bug in HashJoin.

There are two ways to reproduce this issue:

1) - Modify DrillRuleSets.java: remove the comment before SwapJoinRule.
   - alter session set `planner.slice_target` = 10;
   - Run the query.
2) Use the attached physical plan (JSON file) and submit it with "submitplan".
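The top frames of the stack trace show ArrayList.get being called with index 0 on a list of size 0, reached through VectorContainer.getValueAccessorById. The following is only a minimal sketch of that failure mode and is not Drill's code: the Container class, its getById method, and the lookupFails helper are hypothetical names made up for illustration. It shows that an unchecked lookup by field id on a container that holds no columns throws IndexOutOfBoundsException, while the same lookup on a populated container succeeds.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyContainerLookup {

    // Hypothetical stand-in for a container holding one entry per column.
    // This is NOT Drill's VectorContainer; it only mimics a lookup by index.
    static final class Container {
        private final List<String> wrappers = new ArrayList<String>();

        void add(String columnName) {
            wrappers.add(columnName);
        }

        // Unchecked lookup: ArrayList.get throws IndexOutOfBoundsException
        // when fieldId is not a valid index, e.g. when the list is empty.
        String getById(int fieldId) {
            return wrappers.get(fieldId);
        }
    }

    // Returns true if looking up fieldId throws IndexOutOfBoundsException.
    static boolean lookupFails(Container container, int fieldId) {
        try {
            container.getById(fieldId);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Container empty = new Container();   // no columns loaded yet
        Container loaded = new Container();
        loaded.add("o_orderkey");            // one column present

        System.out.println("empty container fails:  " + lookupFails(empty, 0));
        System.out.println("loaded container fails: " + lookupFails(loaded, 0));
    }
}
```

Whether the real failure comes from a batch with no vectors or from some other cause is not established by this sketch; it only reproduces the exception type seen at the top of the trace.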
For comparison, I have also attached the physical plan produced with hash join disabled (using merge join), and the explain plan at the physical operator level.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)