hive-user mailing list archives

From 丁桂涛(桂花) <dinggui...@baixing.com>
Subject Hive UDF gives duplicate result regardless of parameters, when nested in a subquery
Date Wed, 23 Jul 2014 05:34:19 GMT
I recently developed a Hive GenericUDF, *getad*. It takes a map and a string
as parameters and returns a string. However, its output differs depending on
how the query is written.

Condition A:

select
  getad(map_col, 'tp') as tp,
  getad(map_col, 'p') as p,
  getad(map_col, 'sp') as sp
from
  table_name
where
  id = xxxx;

The output is correct: 'tp', 'p', 'sp'.

Condition B:

select
  array(tp, p, sp) as ps
from
  (
  select
    getad(map_col, 'tp') as tp,
    getad(map_col, 'p') as p,
    getad(map_col, 'sp') as sp
  from
    table_name
  where
    id = xxxx
  ) t;

The output is wrong: 'tp', 'tp', 'tp'. The following query, which calls the
UDF directly inside array() without a subquery, returns the same wrong result:

select
  array(
    getad(map_col, 'tp'),
    getad(map_col, 'p'),
    getad(map_col, 'sp')
  ) as ps
from
  table_name
where
  id = xxxx;

Could you please provide me some hints on this? Thanks!

-- 
丁桂涛
