hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Blaine Elliott <bla...@chegg.com>
Subject NULL values and != operations
Date Wed, 05 Feb 2014 18:49:36 GMT
I have come across a strange situation in hive and I want to know if there is an explanation.
 The CASE operation below does not work when the operator is != but does work then the operator
is =.  Maybe it is true that an = operation is valid if a value is NULL.  But an != operation
is invalid if a value is NULL.  That seems bizarre.  Is this a bug or can this be explained?

I am using Amazon EMR w/hadoop v1.0.3 & hive v0.11.0

-- the following SQL results are expected such that the last column is 1 or 0
SELECT
    user_name
  , val0
  , val1
  , CASE WHEN val0 = val1 THEN 1 ELSE 0 END
FROM
    (
    SELECT
        user_name
      , MIN(STR_TO_MAP(kvp, "&", "=")['val0']) AS val0
      , MIN(STR_TO_MAP(kvp, "&", "=")['val1']) AS val1
    FROM
        stgdb.fact_webrequest
    GROUP BY
        user_name
    ) x;

user0 42.01   42.01   1
user1 NULL    14.1301 0
user2 NULL    15.03   0
user3 NULL    43.01   0
user4 NULL    40.05   0
user5 NULL    13.1305 0
user6 51.0913 51.0913 1
user7 NULL    11.0701 0
user8 NULL    52.02   0

-- the following SQL results are strange such that the last column is always 0
SELECT
    user_name
  , val0
  , val1
  , CASE WHEN val0 != val1 THEN 1 ELSE 0 END
FROM
    (
    SELECT
        user_name
      , MIN(STR_TO_MAP(kvp, "&", "=")['val0']) AS val0
      , MIN(STR_TO_MAP(kvp, "&", "=")['val1']) AS val1
    FROM
        stgdb.fact_webrequest
    GROUP BY
        user_name
    ) x;

user0 42.01   42.01   0
user1 NULL    14.1301 0
user2 NULL    15.03   0
user3 NULL    43.01   0
user4 NULL    40.05   0
user5 NULL    13.1305 0
user6 51.0913 51.0913 0
user7 NULL    11.0701 0
user8 NULL    52.02   0


Blaine Elliott
Chegg | Senior Data Engineer
* 805 637 4556 | * blaine@chegg.com<mailto:blaine@chegg.com>

Mime
View raw message