hive-dev mailing list archives

From "Eric Hanson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4271) Limit precision of decimal type
Date Fri, 05 Apr 2013 21:47:15 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624094#comment-13624094 ]

Eric Hanson commented on HIVE-4271:
-----------------------------------

I like this proposal. It will make life easier when the time comes to implement support
for vectorized comparisons and arithmetic (https://issues.apache.org/jira/browse/HIVE-4160)
for decimal, because the data can be stored in an array of LONGs or a pair of LONG values.
This will enable faster query execution. If there are defaults, please make the default
such that the value fits in 18 digits or fewer (a single LONG). Then the standard integer
arithmetic code path can be used for vectorized query execution (QE) in the common case for
decimal. Users should be coached to use 18 digits or fewer for decimal unless their app
really needs more.
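
To illustrate the single-LONG case, here is a minimal sketch in Java. It assumes each decimal is held as an unscaled long plus a scale shared by the whole column. The class and method names (LongDecimalSketch, fitsInLong, toUnscaledLong, addColumns) are hypothetical and are not taken from Hive or from the HIVE-4160 work. The 18-digit bound follows from Long.MAX_VALUE being 9223372036854775807, a 19-digit number, so every 18-digit value fits.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not Hive's actual vectorized decimal code.
class LongDecimalSketch {
    // Long.MAX_VALUE has 19 digits, so every value with at most 18 digits fits.
    static final int MAX_LONG_PRECISION = 18;

    // True if the decimal's unscaled value has at most 18 digits.
    static boolean fitsInLong(BigDecimal d) {
        return d.precision() <= MAX_LONG_PRECISION;
    }

    // Returns the unscaled value as a long, e.g. 123.45 (scale 2) becomes 12345.
    static long toUnscaledLong(BigDecimal d) {
        if (!fitsInLong(d)) {
            throw new ArithmeticException("more than 18 digits: " + d);
        }
        return d.unscaledValue().longValueExact();
    }

    // Adds two columns whose values share the same scale, using plain long arithmetic.
    // Math.addExact throws on overflow instead of silently wrapping.
    static long[] addColumns(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = Math.addExact(a[i], b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Both columns are assumed to use scale 2.
        long[] prices = {
            toUnscaledLong(new BigDecimal("19.99")),
            toUnscaledLong(new BigDecimal("5.01"))
        };
        long[] taxes = {
            toUnscaledLong(new BigDecimal("1.60")),
            toUnscaledLong(new BigDecimal("0.40"))
        };
        for (long total : addColumns(prices, taxes)) {
            System.out.println(BigDecimal.valueOf(total, 2));
        }
    }
}
```

The inner loop touches only long arrays, which is the kind of code path that can be shared with ordinary integer columns.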
                
> Limit precision of decimal type
> -------------------------------
>
>                 Key: HIVE-4271
>                 URL: https://issues.apache.org/jira/browse/HIVE-4271
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, HIVE-4271.4.patch
>
>
> The current decimal implementation does not limit the precision of the numbers. This
> has a number of drawbacks. A maximum precision would allow us to:
> - Have SerDes/file formats store decimals more efficiently
> - Speed up processing by implementing operations w/o generating java BigDecimals
> - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
> - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
> Exact numeric datatypes are typically used to represent money, so if the limit is high
> enough it doesn't really become an issue.
> A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 longs we
> can represent 36 digits - which is what I propose as the limit.
> Final thought: It's easier to restrict this now and have the option to do the things
> above than to try to do so once people start using the datatype.
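
The arithmetic behind the 36-digit limit can be sketched as follows: 9 decimal digits go up to 999,999,999, which is below 2^31 - 1 and so fits in a 4-byte int, and two longs hold four such ints. The Java sketch below is only an illustration under simplifying assumptions (non-negative values, no sign or scale stored). The names PackedDecimalSketch, pack and unpack are hypothetical, and the layout is not the one used by any of the attached patches.

```java
import java.util.Locale;

// Hypothetical illustration of "9 digits per 4 bytes"; not the layout used by Hive.
class PackedDecimalSketch {
    static final int DIGITS_PER_INT = 9;
    static final int MAX_DIGITS = 36; // 4 ints * 9 digits = 2 longs

    // Packs a non-negative digit string of up to 36 digits into two longs.
    static long[] pack(String digits) {
        if (digits.length() > MAX_DIGITS || !digits.matches("[0-9]+")) {
            throw new IllegalArgumentException("expected 1 to 36 decimal digits");
        }
        // Left-pad with zeros so the string splits into exactly four 9-digit groups.
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < MAX_DIGITS; i++) {
            padded.append('0');
        }
        padded.append(digits);

        int[] groups = new int[4];
        for (int i = 0; i < 4; i++) {
            int start = i * DIGITS_PER_INT;
            groups[i] = Integer.parseInt(padded.substring(start, start + DIGITS_PER_INT));
        }
        // Each group is non-negative, so OR-ing it into the low 32 bits is safe.
        long hi = ((long) groups[0] << 32) | groups[1];
        long lo = ((long) groups[2] << 32) | groups[3];
        return new long[] { hi, lo };
    }

    // Reverses pack(), returning a zero-padded 36-character digit string.
    static String unpack(long[] packed) {
        StringBuilder sb = new StringBuilder(MAX_DIGITS);
        for (long word : packed) {
            sb.append(String.format(Locale.ROOT, "%09d", (int) (word >>> 32)));
            sb.append(String.format(Locale.ROOT, "%09d", (int) word));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String value = "123456789012345678901234567890123456"; // 36 digits
        long[] packed = pack(value);
        System.out.println(unpack(packed).equals(value));
    }
}
```

A real format would also need to carry the sign and the scale, which this sketch leaves out.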

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
