hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vineet Garg <vg...@hortonworks.com>
Subject Re: hive 3.1 mapjoin with complex predicate produce incorrect results
Date Fri, 21 Dec 2018 17:38:52 GMT
Hi Andrey,

I tried this on latest master and wasn’t able to reproduce. Would you mind sharing explain
plan output? (after setting hive.user.explain = false).

Vineet

> On Dec 20, 2018, at 11:37 AM, Andrey Zinovyev <andrey.zinovyev@gmail.com> wrote:
> 
> Hi,
> We stumbled on some weird behaviour of mapjoin in hive 3.1 
> Sample schema:
> > create table table_data(key int, a int);
> > insert into table_data values (1, 1), (2, 2), (1, 3), (2, 4), (3, 5);
> > create table table_dict(key int, b int);
> > insert into table_dict values (1, 42), (2, 43);
> 
> Query:
> >SELECT xs.key, dict.key, dict.b
> >FROM table_data as xs
> >LEFT JOIN table_dict as dict ON if((xs.key is null) or (xs.key = ''), 44, xs.key)
= dict.key;
> 
> returns wrong result when hive.auto.convert.join=true;
> +---------+-----------+---------+
> | xs.key  | dict.key  | dict.b  |
> +---------+-----------+---------+
> | 1       | 1         | 42      |
> | 2       | 1         | 43      |
> | 1       | 1         | 42      |
> | 2       | 1         | 43      |
> | 3       | 1         | NULL    |
> +---------+-----------+---------
> 
> xs.key != dict.key (but they should be cause I join on them) while dict.b values are
right
> 
> when hive.auto.convert.join=false results are currect
> +---------+-----------+---------+
> | xs.key  | dict.key  | dict.b  |
> +---------+-----------+---------+
> | 1       | 1         | 42      |
> | 1       | 1         | 42      |
> | 2       | 2         | 43      |
> | 2       | 2         | 43      |
> | 3       | NULL      | NULL    |
> +---------+-----------+---------+
> 
> It is definitely caused by if expression in ON. 
> 
> 
> -- 
> Andrey

Mime
View raw message