hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP
Date Tue, 18 Oct 2016 01:20:58 GMT
Sergey Shelukhin created HIVE-14995:
---------------------------------------

             Summary: double conversion can corrupt partition column values for insert overwrite with DP
                 Key: HIVE-14995
                 URL: https://issues.apache.org/jira/browse/HIVE-14995
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin
            Priority: Critical


{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc limit 1;
{noformat}

The select query returns k1 and k2 as double, because src.key is a string and key + 1 is therefore computed in double arithmetic.

On insert, the value is converted to int correctly for the regular column, but not for the partition column. Output of the plain select:
{noformat}
498	499.0	499.0
{noformat}

Explain output for the insert (excerpt):
{noformat}
    Map Reduce
      Map Operator Tree:
...
              Select Operator
                expressions: (UDFToDouble(key) + 1.0) (type: double)
...
                Reduce Output Operator
                  key expressions: _col0 (type: double)
                  sort order: -
...
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 (type: double)
...
            Select Operator
              expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
 .... followed by FSOP (FileSinkOperator) and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499	NULL
{noformat}
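Whoops! The partition is created as key2=499.0 instead of key2=499, and the partition column reads back as NULL.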
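For illustration only, here is a small Python model of what the output above suggests is happening. It is not Hive code. It assumes that the partition value is written into the partition name in its double string form, and that reading that string back as an int fails and yields NULL. The helper names to_partition_name and read_partition_as_int are hypothetical.

```python
def to_partition_name(column, value):
    # Assumption: the partition value is rendered as a string as-is,
    # so a double keeps its trailing ".0".
    return f"{column}={value}"


def read_partition_as_int(partition_name):
    # Assumption: the stored string is parsed as an int on read;
    # a failed parse is modeled as None (SQL NULL).
    raw = partition_name.split("=", 1)[1]
    try:
        return int(raw)
    except ValueError:
        return None


# src.key is a string, so key + 1 is computed as a double.
k = float("498") + 1.0

# Regular column: the plan applies UDFToInteger, modeled here by int().
regular_column = int(k)

# Partition column: no int conversion in the plan, the double goes through.
bad_partition = to_partition_name("key2", k)

# What the partition would look like with the conversion applied.
good_partition = to_partition_name("key2", int(k))

print(regular_column, bad_partition, read_partition_as_int(bad_partition))
print(regular_column, good_partition, read_partition_as_int(good_partition))
```

Under these assumptions the model reproduces the partition name key2=499.0 and the NULL partition value seen in the POSTHOOK output, while the converted value round-trips as 499.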





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
