hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Martin Kudlej <mkud...@redhat.com>
Subject full outer join result
Date Wed, 12 Mar 2014 13:47:12 GMT
Hi all,

I've tried BigTop test for join_filters:
CREATE TABLE myinput1(key int, value int);
LOAD DATA LOCAL INPATH 'seed_data_files/in3.txt' INTO TABLE myinput1;

where seed_data_files/in3.txt:
12      35
NULL    40
48      NULL
100     100

I've tried:
SELECT * FROM myinput1 a FULL OUTER JOIN myinput1 b on a.key > 40 AND a.value > 50 AND
a.key = 
a.value AND b.key > 40 AND b.value > 50 AND b.key = b.value ORDER BY a.key, a.value,
b.key, b.value;

and expected result in test is:
NULL  NULL  NULL  40
NULL  NULL  NULL  40
NULL  NULL  NULL  40
NULL  NULL  NULL  40
NULL  NULL  12  35
NULL  NULL  12  35
NULL  NULL  12  35
NULL  NULL  12  35
NULL  NULL  48  NULL
NULL  NULL  48  NULL
NULL  NULL  48  NULL
NULL  NULL  48  NULL
NULL  40  NULL  NULL
NULL  40  NULL  NULL
NULL  40  NULL  NULL
NULL  40  NULL  NULL
12  35  NULL  NULL
12  35  NULL  NULL
12  35  NULL  NULL
12  35  NULL  NULL
48  NULL  NULL  NULL
48  NULL  NULL  NULL
48  NULL  NULL  NULL
48  NULL  NULL  NULL
100 100 NULL  NULL
100 100 NULL  NULL
100 100 NULL  NULL
100 100 100 100


but real hive result is:
NULL    NULL    NULL    40
NULL    NULL    12      35
NULL    NULL    48      NULL
NULL    40      NULL    NULL
12      35      NULL    NULL
48      NULL    NULL    NULL
100     100     100     100

btw. result from postgresql is:
(SELECT *
   FROM myinput1 a
LEFT JOIN
   myinput1 b on
a.key > 40 AND
     a.value > 50 AND
     a.key = a.value AND
     b.key > 40 AND
     b.value > 50 AND
     b.key = b.value  ORDER BY a.key, a.value, b.key, b.value)
UNION (SELECT *
   FROM myinput1 a
RIGHT JOIN
   myinput1 b on
a.key > 40 AND
     a.value > 50 AND
     a.key = a.value AND
     b.key > 40 AND
     b.value > 50 AND
     b.key = b.value
ORDER BY a.key, a.value, b.key, b.value);
      |       |  12 |    35
   12 |    35 |     |
      |       |  48 |
   48 |       |     |
      |    40 |     |
      |       |     |    40
  100 |   100 | 100 |   100

so it's the same like in hive.

What is the right result for this full outer join in HiveQL, please?

-- 
Best Regards,
Martin Kudlej.
MRG/Grid & RHS-Hadoop Senior Quality Assurance Engineer
Red Hat Czech s.r.o.

Phone: +420 532 294 155
E-mail:mkudlej at redhat.com
IRC:   mkudlej at #brno, #messaging, #grid, #rhs, #distcomp

Mime
View raw message