hive-dev mailing list archives

From Sukhendu Chakraborty <sukhendu.chakrabo...@gmail.com>
Subject SMB join bug
Date Fri, 02 May 2014 02:10:42 GMT
I am seeing a very different number of rows in the output of this query
depending on whether I enable SMB (sort-merge-bucket) join:

select count(*)
from dss.hist_hshld_profl_mc a
join dss.hshld_summary_mc b
  on a.hh_key = b.hh_key
where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
  and a.hshld_exp_dt = '9999-12-31'
  and trim(a.cntry_id) = 'USA'

The SMB join returns 60 rows (the wrong value), while the regular join
returns more than 30 million rows (the correct value).
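
For reference, a minimal sketch of how the two runs compared above might be
toggled. The exact settings used are not stated in this message, so the
property values below are an assumption based on the standard Hive SMB join
properties; both tables are assumed to be bucketed and sorted on hh_key.

```sql
-- Assumed settings for the SMB join run (not confirmed by this message).
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
-- ... run the query above and note the count ...

-- Assumed settings for the regular join run.
set hive.auto.convert.sortmerge.join=false;
set hive.optimize.bucketmapjoin=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;
-- ... run the same query again and compare the count ...
```

Running EXPLAIN on the query under each set of properties should show whether
a Sorted Merge Bucket Map Join operator is actually chosen.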

Is there a known issue or JIRA for this? We are using CDH 5.0 (Hive 0.12).

-Sukhendu
