hive-issues mailing list archives

From Alain Schröder (JIRA) <j...@apache.org>
Subject [jira] [Updated] (HIVE-10083) SMBJoin fails in case one table is uninitialized
Date Wed, 25 Mar 2015 15:42:53 GMT

     [ https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alain Schröder updated HIVE-10083:
----------------------------------
    Affects Version/s:     (was: 0.13.1)
                       0.13.0

> SMBJoin fails in case one table is uninitialized
> ------------------------------------------------
>
>                 Key: HIVE-10083
>                 URL: https://issues.apache.org/jira/browse/HIVE-10083
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 0.13.0
>         Environment: MapR Hive 0.13
>            Reporter: Alain Schröder
>            Priority: Minor
>
> We experience an IndexOutOfBoundsException in an SMBJoin when exactly one of the tables used for
the JOIN is uninitialized. Everything works if both tables are uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>         at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>         at java.util.ArrayList.get(ArrayList.java:411)
>         at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
>         at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
>         at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
>         at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
>         at org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.input.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE  TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE  TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2 
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the method fillMappingBigTableBucketFileNameToSmallTableBucketFileNames
in AbstractBucketJoinProc.java, and it does not seem to have changed between our MapR Hive 0.13
and the current snapshot, so this error should also occur in the current version.
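
The stack trace is consistent with an unchecked ArrayList.get() on an empty list of bucket file names for the uninitialized table. The Java sketch below illustrates that failure mode and one possible guard. It is not Hive's actual code: the class BucketMappingSketch, both methods, and the file names are hypothetical, and it assumes that both tables have the same number of buckets, as in the repro.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT the code in AbstractBucketJoinProc.
class BucketMappingSketch {

    // Maps each big-table bucket file to the small-table bucket file at the
    // same index, without checking whether the small table has any files.
    static Map<String, List<String>> mapUnguarded(List<String> bigFiles,
                                                  List<String> smallFiles) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        for (int i = 0; i < bigFiles.size(); i++) {
            List<String> matched = new ArrayList<>();
            // Throws IndexOutOfBoundsException if smallFiles has no entry at i,
            // for example when the small table was never written to.
            matched.add(smallFiles.get(i));
            mapping.put(bigFiles.get(i), matched);
        }
        return mapping;
    }

    // Same mapping, but a missing small-table bucket file yields an empty list.
    static Map<String, List<String>> mapGuarded(List<String> bigFiles,
                                                List<String> smallFiles) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        for (int i = 0; i < bigFiles.size(); i++) {
            List<String> matched = new ArrayList<>();
            if (i < smallFiles.size()) {
                matched.add(smallFiles.get(i));
            }
            mapping.put(bigFiles.get(i), matched);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Made-up bucket file names for the initialized table.
        List<String> initialized = Arrays.asList("000000_0", "000001_0");
        List<String> uninitialized = new ArrayList<>();

        try {
            mapUnguarded(initialized, uninitialized);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded mapping failed: " + e.getMessage());
        }
        System.out.println("guarded mapping: " + mapGuarded(initialized, uninitialized));
    }
}
```

In this sketch, two empty lists never enter the loop, which would match the observation that the query works when both tables are uninitialized. Whether the real optimizer should guard the lookup or skip the SMB conversion entirely for an empty table is a separate design question.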



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
