Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Date: Fri, 13 Dec 2013 15:38:07 +0000 (UTC)
From: "Xuefu Zhang (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-5996) Query for sum of a long column of a table with only two rows produces wrong result

    [ https://issues.apache.org/jira/browse/HIVE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847579#comment-13847579 ]

Xuefu Zhang commented on HIVE-5996:
-----------------------------------

Yes, in theory, but it is unlikely. The current precision formula, 10 + p as the output precision (where p is the input precision), gives some room. Of course, if the input precision is already at the maximum, then the output type precision will be the same as the input precision. In that case, for really big numbers, the sum can overflow.
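
To make the precision rule above concrete, here is a minimal Java sketch of it. The class and method names are hypothetical, and the cap of 38 digits is an assumption about the maximum decimal precision; the comment itself only says that an input already at the maximum keeps its precision. Clamping with `Math.min` for inputs just below the maximum is likewise an assumption, not something confirmed by the comment.

```java
// Hypothetical sketch of the "10 + p" output-precision rule for sum().
final class SumPrecisionSketch {
    // ASSUMPTION: maximum decimal precision is 38 digits.
    static final int MAX_PRECISION = 38;

    // Output precision for a sum over an input of precision p:
    // 10 + p, clamped to the assumed maximum.
    static int sumOutputPrecision(int inputPrecision) {
        return Math.min(MAX_PRECISION, inputPrecision + 10);
    }

    public static void main(String[] args) {
        // A bigint needs up to 19 digits, so there are 10 digits of headroom.
        System.out.println(sumOutputPrecision(19)); // prints 29
        // Input already at the assumed maximum: no headroom is left.
        System.out.println(sumOutputPrecision(38)); // prints 38
    }
}
```

With ten extra digits, a sum would need on the order of 10^10 maximal-magnitude rows to exhaust the headroom, which is why overflow is possible in theory but unlikely except when the input precision is already at the cap.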
Anyway, I'm going to close the JIRA, and with it the discussion.

> Query for sum of a long column of a table with only two rows produces wrong result
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-5996
>                 URL: https://issues.apache.org/jira/browse/HIVE-5996
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 0.12.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-5996.patch
>
>
> {code}
> hive> desc test2;
> OK
> l    bigint    None
> hive> select * from test2;
> OK
> 6666666666666666666
> 5555555555555555555
> hive> select sum(l) from test2;
> OK
> -6224521851487329395
> {code}
> It's believed that a wrap-around error occurred. It's surprising that it happens with only two rows. The same query in MySQL returns:
> {code}
> mysql> select sum(l) from test;
> +----------------------+
> | sum(l)               |
> +----------------------+
> | 12222222222222222221 |
> +----------------------+
> 1 row in set (0.00 sec)
> {code}
> Hive should accommodate a large number of rows. Overflowing with only two rows makes sum() unusable.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
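
For reference, the wrap-around in the quoted description can be reproduced outside Hive with plain 64-bit arithmetic. The Java sketch below uses hypothetical class and method names and is not taken from HIVE-5996.patch. It adds the two values from the report as `long` (which wraps silently) and as `BigDecimal` (which does not); the `BigDecimal` variant is only one possible overflow-free alternative, shown for comparison.

```java
import java.math.BigDecimal;

// Hypothetical reproduction of the wrap-around reported in HIVE-5996.
final class LongSumOverflowSketch {
    // Sums with 64-bit signed arithmetic; overflow wraps around silently.
    static long sumAsLong(long[] values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Sums with arbitrary-precision arithmetic; cannot wrap around.
    static BigDecimal sumAsDecimal(long[] values) {
        BigDecimal total = BigDecimal.ZERO;
        for (long v : values) {
            total = total.add(BigDecimal.valueOf(v));
        }
        return total;
    }

    public static void main(String[] args) {
        long[] rows = {6666666666666666666L, 5555555555555555555L};
        System.out.println(sumAsLong(rows));    // prints -6224521851487329395
        System.out.println(sumAsDecimal(rows)); // prints 12222222222222222221
    }
}
```

The wrapped value is the true sum minus 2^64: 12222222222222222221 - 18446744073709551616 = -6224521851487329395, which matches the Hive output in the report.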