Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Reply-To: dev@hive.apache.org
Date: Mon, 18 Aug 2014 15:58:18 +0000 (UTC)
From: Sergio Peña (JIRA)
To: hive-dev@hadoop.apache.org
Subject: [jira] [Commented] (HIVE-7373) Hive should not remove trailing zeros for decimal numbers


    [ https://issues.apache.org/jira/browse/HIVE-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100762#comment-14100762 ]

Sergio Peña commented on HIVE-7373:
-----------------------------------

There is a problem with how the factor value is stored when serializing the value 0. When 0 was serialized, it was deserialized as 0.00.

The bug is that the factor was serialized unchanged only when the sign was 1 (positive). For the other signs, 0 and negative, the factor was negated.
Then, in the deserialize function, the factor was left unchanged for positive and 0 values, and only a negative value negated the factor.

(serialize)

    int sign = dec.compareTo(HiveDecimal.ZERO);
    int factor = dec.precision() - dec.scale();
    factor = sign == 1 ? factor : -factor;          // BUG
    writeByte(buffer, (byte) (sign + 1), invert);

(deserialize)

    int b = buffer.read(invert) - 1;
    boolean positive = b != -1;
    if (!positive) {
      factor = -factor;
    }

Here's a data example of the bug (length=1):

    value  type          prec-scale  serialize  | deserialize  scale = factor-length
    -1.0   decimal(1,1)  factor=0    factor=-0  | factor=0     0-1  (-1)
    -1     decimal(1,0)  factor=1    factor=-1  | factor=1     1-1  (0)
    0      decimal(1,0)  factor=1    factor=-1  | factor=-1    -1-1 (-2)  BUG
    0.0    decimal(1,1)  factor=0    factor=-0  | factor=-0    0-1  (-1)
    1      decimal(1,0)  factor=1    factor=1   | factor=1     1-1  (0)
    1.0    decimal(1,1)  factor=0    factor=0   | factor=0     0-1  (-1)

And with the fix on serialize:

    factor = sign != -1 ?
        factor : -factor;                           // FIX

Data example with the fix (length=1):

    value  type          prec-scale  serialize  | deserialize  scale = factor-length
    -1.0   decimal(1,1)  factor=0    factor=-0  | factor=0     0-1  (-1)
    -1     decimal(1,0)  factor=1    factor=-1  | factor=1     1-1  (0)
    0      decimal(1,0)  factor=1    factor=1   | factor=1     1-1  (0)   FIX
    0.0    decimal(1,1)  factor=0    factor=0   | factor=0     0-1  (-1)
    1      decimal(1,0)  factor=1    factor=1   | factor=1     1-1  (0)
    1.0    decimal(1,1)  factor=0    factor=0   | factor=0     0-1  (-1)


> Hive should not remove trailing zeros for decimal numbers
> ---------------------------------------------------------
>
>                 Key: HIVE-7373
>                 URL: https://issues.apache.org/jira/browse/HIVE-7373
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>    Affects Versions: 0.13.0, 0.13.1
>            Reporter: Xuefu Zhang
>            Assignee: Sergio Peña
>         Attachments: HIVE-7373.1.patch, HIVE-7373.2.patch, HIVE-7373.3.patch, HIVE-7373.4.patch, HIVE-7373.5.patch, HIVE-7373.6.patch, HIVE-7373.6.patch
>
>
> Currently Hive blindly removes trailing zeros of a decimal input number as a sort of standardization. This is questionable in theory and problematic in practice.
> 1. In decimal context, the number 3.140000 has a different semantic meaning from the number 3.14. Removing trailing zeroes loses that meaning.
> 2. In an extreme case, 0.0 has (p, s) of (1, 1). Hive removes the trailing zeros, and the number becomes 0, which has (p, s) of (1, 0). Thus, for a decimal column of (1,1), input such as 0.0, 0.00, and so on becomes NULL because the column doesn't allow a decimal number with an integer part.
> Therefore, I propose that Hive preserve the trailing zeroes (up to what the scale allows). With this, in the above example, 0.0, 0.00, and 0.0000 will be represented as 0.0 (precision=1, scale=1) internally.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
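
For anyone who wants to reproduce the sign handling described in the comment without a Hive build, here is a minimal standalone sketch. It is an illustration under assumptions, not the patch itself: java.math.BigDecimal stands in for HiveDecimal (so precision and scale for some inputs may differ from the tables above), and the class and method names (FactorSignRoundTrip, serializedFactor, deserializedFactor, roundTrips) are hypothetical. Only the two ternaries and the sign-byte check are taken from the snippets in the comment.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for the serialize/deserialize sign handling.
// BigDecimal is used instead of HiveDecimal so the sketch is self-contained.
final class FactorSignRoundTrip {

    // Factor as written by serialize, with either the original or the fixed ternary.
    static int serializedFactor(BigDecimal dec, boolean fixed) {
        int sign = dec.compareTo(BigDecimal.ZERO);   // -1, 0 or 1
        int factor = dec.precision() - dec.scale();
        if (fixed) {
            return sign != -1 ? factor : -factor;    // FIX: negate only for negatives
        }
        return sign == 1 ? factor : -factor;         // BUG: also negates when sign == 0
    }

    // Factor as recovered by deserialize: the stored byte is sign + 1,
    // and the factor is negated back only when the decoded sign is -1.
    static int deserializedFactor(int storedFactor, int sign) {
        int b = (sign + 1) - 1;
        boolean positive = b != -1;
        return positive ? storedFactor : -storedFactor;
    }

    // True when the factor survives a serialize/deserialize round trip.
    static boolean roundTrips(BigDecimal dec, boolean fixed) {
        int sign = dec.compareTo(BigDecimal.ZERO);
        int original = dec.precision() - dec.scale();
        return deserializedFactor(serializedFactor(dec, fixed), sign) == original;
    }

    public static void main(String[] args) {
        String[] inputs = {"-1.0", "-1", "0", "0.0", "1", "1.0"};
        for (String s : inputs) {
            BigDecimal dec = new BigDecimal(s);
            System.out.println(s + " buggy=" + roundTrips(dec, false)
                    + " fixed=" + roundTrips(dec, true));
        }
    }
}
```

In this sketch only the input 0 fails the round trip with the original ternary, because its factor is non-zero and its sign is 0. Inputs whose factor is 0 are unaffected, since negating 0 changes nothing.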