hive-dev mailing list archives

From "Nicholas Brenwald (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
Date Wed, 19 Aug 2015 12:23:45 GMT
Nicholas Brenwald created HIVE-11603:
----------------------------------------

             Summary: IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
                 Key: HIVE-11603
                 URL: https://issues.apache.org/jira/browse/HIVE-11603
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.3.0
         Environment: Hadoop 2.6
            Reporter: Nicholas Brenwald
            Priority: Minor
             Fix For: 2.0.0


Create two empty tables, t1 and t2:
{code}
CREATE TABLE t1(c1 STRING);
CREATE TABLE t2(c1 STRING, c2 INT);
{code}

Create a view over these two tables:
{code}
CREATE VIEW v1 AS 
SELECT c1, c2 
FROM (
    SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
    UNION ALL
    SELECT c1, c2 FROM t2
) x;
{code}

Then run:
{code}
SELECT COUNT(*) FROM v1
WHERE c2 = 0;
{code}

We expect a result of zero, but instead the query fails with the following stack trace:
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:635)
	at java.util.ArrayList.get(ArrayList.java:411)
	at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
	at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
	... 22 more
{code}
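
The trace shows the exception being raised while the map task initializes its operators (ExecMapper.configure), so the query presumably still compiles and its plan can be inspected. A minimal sketch, assuming the tables and view above exist; the plan output varies by version and is not reproduced here:
{code}
-- Print the plan for the failing query and check where the filter on c2
-- sits relative to the Union operator in each branch.
EXPLAIN
SELECT COUNT(*) FROM v1
WHERE c2 = 0;
{code}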

Workarounds include disabling predicate pushdown (PPD):
{code}
set hive.optimize.ppd=false;
{code}
Or changing the view so that c2 in the t1 branch is NULL cast to DOUBLE rather than INT:
{code}
CREATE VIEW v1_workaround AS 
SELECT c1, c2 
FROM (
    SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
    UNION ALL
    SELECT c1, c2 FROM t2
) x;
{code}
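
Either workaround can be checked against the original query. The following is a sketch only, assuming the same session and the objects created above; restoring hive.optimize.ppd to true assumes the default value was in effect beforehand:
{code}
-- Workaround 1: disable predicate pushdown only around the affected query.
SET hive.optimize.ppd=false;
SELECT COUNT(*) FROM v1 WHERE c2 = 0;
-- Restore the setting afterwards (assumes it was at its default of true).
SET hive.optimize.ppd=true;

-- Workaround 2: query the alternative view with PPD left enabled.
SELECT COUNT(*) FROM v1_workaround WHERE c2 = 0;
{code}
With both tables empty, each query should return the zero count that the original query was expected to produce.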

The problem appears to affect branch-1.1, branch-1.2, and branch-1, but seems to be resolved in
master (2.0.0).


