hive-dev mailing list archives

From Sergio Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (HIVE-7373) Hive should not remove trailing zeros for decimal numbers
Date Mon, 18 Aug 2014 15:58:18 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100762#comment-14100762 ]

Sergio Peña commented on HIVE-7373:
-----------------------------------

There is a problem with how the factor is stored when the value 0 is serialized: a serialized
0 was being deserialized as 0.00.

The bug is that the factor was serialized unchanged only when the sign was 1 (positive).
For the other signs, 0 and -1 (negative), the factor was negated.
The deserialize function, however, leaves the factor unchanged for positive and 0 values and
negates it only for negative values. So for 0, the factor was negated on serialize but never
negated back on deserialize.

(serialize)
    int sign = dec.compareTo(HiveDecimal.ZERO);
    int factor = dec.precision() - dec.scale();
    factor = sign == 1 ? factor : -factor;                                 (BUG)
    writeByte(buffer, (byte) ( sign + 1), invert);

(deserialize)
    int b = buffer.read(invert) - 1;
    boolean positive = b != -1;
    if (!positive) {
        factor = -factor;
    }

Here is a data example of the bug:
length=1 for every row
value   type          prec-scale  serialize | deserialize  scale = factor-length
-1.0    decimal(1,1)  factor=0    factor=-0 | factor=0     scale =  0-1 (-1)
-1      decimal(1,0)  factor=1    factor=-1 | factor=1     scale =  1-1  (0)
 0      decimal(1,0)  factor=1    factor=-1 | factor=-1    scale = -1-1 (-2) BUG
 0.0    decimal(1,1)  factor=0    factor=-0 | factor=-0    scale =  0-1 (-1)
 1      decimal(1,0)  factor=1    factor=1  | factor=1     scale =  1-1  (0)
 1.0    decimal(1,1)  factor=0    factor=0  | factor=0     scale =  0-1 (-1)

And with the fix on serialize:
   factor = sign != -1 ? factor : -factor;                                 (FIX)

length=1 for every row
value   type          prec-scale  serialize | deserialize  scale = factor-length
-1.0    decimal(1,1)  factor=0    factor=-0 | factor=0     scale =  0-1 (-1)
-1      decimal(1,0)  factor=1    factor=-1 | factor=1     scale =  1-1  (0)
 0      decimal(1,0)  factor=1    factor=1  | factor=1     scale =  1-1  (0) FIX
 0.0    decimal(1,1)  factor=0    factor=0  | factor=0     scale =  0-1 (-1)
 1      decimal(1,0)  factor=1    factor=1  | factor=1     scale =  1-1  (0)
 1.0    decimal(1,1)  factor=0    factor=0  | factor=0     scale =  0-1 (-1)
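
The sign handling above can be reproduced outside Hive with a small standalone sketch. It is
not the actual serializer code: it uses java.math.BigDecimal in place of HiveDecimal, skips the
byte buffer and the invert flag entirely, and the class and method names (FactorSignDemo,
encodeFactor, decodeFactor, roundTrip) are made up for illustration. It only models how the
factor is negated on each side.

```java
import java.math.BigDecimal;

// Illustrative model only; not the Hive serializer.
class FactorSignDemo {

    // Factor as stored by serialize. fixed=false uses the old condition (sign == 1),
    // fixed=true uses the corrected one (sign != -1).
    public static int encodeFactor(BigDecimal dec, boolean fixed) {
        int sign = dec.signum();                    // -1, 0 or 1
        int factor = dec.precision() - dec.scale();
        if (fixed) {
            return sign != -1 ? factor : -factor;
        }
        return sign == 1 ? factor : -factor;
    }

    // Sign byte as stored by serialize: 0 for negative, 1 for zero, 2 for positive.
    public static int signByte(BigDecimal dec) {
        return dec.signum() + 1;
    }

    // Factor as recovered by deserialize: negated back only for negative values.
    public static int decodeFactor(int signByte, int storedFactor) {
        int b = signByte - 1;
        boolean positive = b != -1;
        return positive ? storedFactor : -storedFactor;
    }

    // Serialize then deserialize the factor of dec.
    public static int roundTrip(BigDecimal dec, boolean fixed) {
        return decodeFactor(signByte(dec), encodeFactor(dec, fixed));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"-1", "0", "1"}) {
            BigDecimal dec = new BigDecimal(s);
            int original = dec.precision() - dec.scale();
            System.out.println(s + ": factor=" + original
                    + " old=" + roundTrip(dec, false)
                    + " fixed=" + roundTrip(dec, true));
        }
    }
}
```

With the old condition only the value 0 comes back with a different factor (-1 instead of 1);
with the corrected condition all three values round-trip unchanged.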

> Hive should not remove trailing zeros for decimal numbers
> ---------------------------------------------------------
>
>                 Key: HIVE-7373
>                 URL: https://issues.apache.org/jira/browse/HIVE-7373
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>    Affects Versions: 0.13.0, 0.13.1
>            Reporter: Xuefu Zhang
>            Assignee: Sergio Peña
>         Attachments: HIVE-7373.1.patch, HIVE-7373.2.patch, HIVE-7373.3.patch, HIVE-7373.4.patch,
HIVE-7373.5.patch, HIVE-7373.6.patch, HIVE-7373.6.patch
>
>
> Currently Hive blindly removes trailing zeros of a decimal input number as a sort of standardization.
This is questionable in theory and problematic in practice.
> 1. In decimal context, the number 3.140000 has a different semantic meaning from the number
3.14. Removing trailing zeros loses that meaning.
> 2. In an extreme case, 0.0 has (p, s) as (1, 1). Hive removes trailing zeros, and then
the number becomes 0, which has (p, s) of (1, 0). Thus, for a decimal column of (1,1), input
such as 0.0, 0.00, and so on becomes NULL because the column doesn't allow a decimal number
with an integer part.
> Therefore, I propose Hive preserve the trailing zeros (up to what the scale allows).
With this, in the above example, 0.0, 0.00, and 0.0000 will be represented as 0.0 (precision=1,
scale=1) internally.
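
The (p, s) values quoted in the description can be checked with plain java.math.BigDecimal,
as in the sketch below. It is an illustration only: the fits helper is a hypothetical stand-in
for a decimal(p, s) column check and is not Hive's actual enforcement code.

```java
import java.math.BigDecimal;

// Illustrative only; DecimalFitDemo and fits are hypothetical names, not Hive APIs.
class DecimalFitDemo {

    // True if dec needs no more fractional digits than s and no more
    // integer digits than p - s.
    public static boolean fits(BigDecimal dec, int p, int s) {
        int integerDigits = dec.precision() - dec.scale();
        return dec.scale() <= s && integerDigits <= p - s;
    }

    public static void main(String[] args) {
        BigDecimal zeroPointZero = new BigDecimal("0.0"); // (p, s) = (1, 1)
        BigDecimal zero = new BigDecimal("0");            // (p, s) = (1, 0)
        System.out.println("0.0 fits decimal(1,1): " + fits(zeroPointZero, 1, 1));
        System.out.println("0   fits decimal(1,1): " + fits(zero, 1, 1));

        // Stripping trailing zeros keeps the numeric value but changes precision and scale.
        BigDecimal padded = new BigDecimal("3.140000");
        BigDecimal stripped = padded.stripTrailingZeros();
        System.out.println(padded.precision() + "," + padded.scale()
                + " -> " + stripped.precision() + "," + stripped.scale());
    }
}
```

Under this model 0.0 fits a decimal(1,1) column while 0 does not, which matches the NULL
behaviour described in point 2.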



--
This message was sent by Atlassian JIRA
(v6.2#6252)
