hive-dev mailing list archives

From Thejas Nair <the...@hortonworks.com>
Subject Re: SMB join bug
Date Fri, 02 May 2014 18:37:34 GMT
You may have hit this issue:
https://issues.apache.org/jira/browse/HIVE-5973
It is fixed in the Apache Hive 0.13 release.
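
Until you can upgrade, one way to sidestep the problem is to turn SMB join off for the session and re-run the query. This is only a sketch: it assumes the three standard Hive properties below are what enable SMB join in your setup, so adjust it if you switch it on some other way.

```sql
-- Assumption: these session-level properties control SMB join conversion here.
set hive.auto.convert.sortmerge.join=false;
set hive.optimize.bucketmapjoin=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;

-- Same query as in the original message, now run as a regular join.
select count(*)
from dss.hist_hshld_profl_mc a
  join dss.hshld_summary_mc b
    on a.hh_key = b.hh_key
where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
  and a.hshld_exp_dt = '9999-12-31'
  and trim(a.cntry_id) = 'USA';
```

If the count matches the regular-join result with these off and drops again when they are set back to true, that points at the SMB join path rather than the data.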


On Thu, May 1, 2014 at 7:10 PM, Sukhendu Chakraborty
<sukhendu.chakraborty@gmail.com> wrote:
> I am seeing a very different number of rows in this query's output depending on
> whether I enable SMB join:
>
> select count(*)
> from dss.hist_hshld_profl_mc a
>   join dss.hshld_summary_mc b
>     on a.hh_key = b.hh_key
> where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
>   and a.hshld_exp_dt = '9999-12-31'
>   and trim(a.cntry_id) = 'USA'
>
> The SMB join returns 60 rows (wrong value) while the regular join returns
> 30 million plus rows (correct value).
>
> Is there a known issue/JIRA for this? We are using CDH 5.0 / Hive 0.12.
>
> -Sukhendu

