hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pengcheng Xiong (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-9717) The max/min function used by AggrStats for decimal type is not what we expected
Date Wed, 18 Feb 2015 19:17:12 GMT
Pengcheng Xiong created HIVE-9717:
-------------------------------------

             Summary: The max/min function used by AggrStats for decimal type is not what
we expected
                 Key: HIVE-9717
                 URL: https://issues.apache.org/jira/browse/HIVE-9717
             Project: Hive
          Issue Type: Bug
            Reporter: Pengcheng Xiong


In current version hive-schema-1.2.0, in TABLE PART_COL_STATS, we store the "BIG_DECIMAL_LOW_VALUE"
and "BIG_DECIMAL_HIGH_VALUE" as varchar. For example,

derby
"BIG_DECIMAL_LOW_VALUE" VARCHAR(4000), "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000)

mssql
BIG_DECIMAL_HIGH_VALUE varchar(255) NULL,
    BIG_DECIMAL_LOW_VALUE varchar(255) NULL,

mysql
`BIG_DECIMAL_LOW_VALUE` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin,
 `BIG_DECIMAL_HIGH_VALUE` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin,

oracle
BIG_DECIMAL_LOW_VALUE VARCHAR2(4000),
 BIG_DECIMAL_HIGH_VALUE VARCHAR2(4000),

postgres
"BIG_DECIMAL_LOW_VALUE" character varying(4000) DEFAULT NULL::character varying,
 "BIG_DECIMAL_HIGH_VALUE" character varying(4000) DEFAULT NULL::character varying,

And, when we do the aggrstats, we do a MAX/MIN of all the BIG_DECIMAL_HIGH_VALU/BIG_DECIMAL_LOW_VALUEE
of partitions. We are expecting a max/min of a decimal (a number). However, it is actually
a max/min of a varchar (a string). As a result, '900' is more than '1000'. This also affects
the extrapolation of the status. The proposed solution is to use a CAST function to cast it
to decimal. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message