hive-dev mailing list archives

From "Eric Hanson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5762) Implement vectorized support for the DECIMAL data type
Date Mon, 09 Dec 2013 22:04:07 GMT
```
[ https://issues.apache.org/jira/browse/HIVE-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843601#comment-13843601 ]

Eric Hanson commented on HIVE-5762:
-----------------------------------

I'm thinking about using this basic structure for a decimal column vector for limited-precision
decimals. A utility package of static functions can then be implemented to do decimal arithmetic
on individual values. This should be a lot faster than code that relies on
java.math.BigDecimal, because it is less general and because object allocation (new) and
garbage collection will be reduced.

{code}
public class DecimalColumnVector extends ColumnVector {
  public int precision; // precision of all elements in vector (max 38)
  public int scale;     // scale of all elements in vector (max 38)
  public static final int WORDS_PER_VALUE = 4;

  /**
   * Logically a vector of 128-bit unsigned ints, stored "little-endian": for a
   * value v, v[0] is the least significant word. Each of the four 32-bit words
   * is treated as unsigned. However, the high-order bit of the highest word
   * (word 3) must be 0.
   */
  public int[][] vector;
  public byte[] sign;  // -1 if negative, 0 if zero, 1 if positive

  public DecimalColumnVector() {
    super(VectorizedRowBatch.DEFAULT_SIZE);
    final int len = VectorizedRowBatch.DEFAULT_SIZE;
    // Allocate all value storage up front so per-row arithmetic needs no new().
    vector = new int[len][];
    for (int i = 0; i < len; i++) {
      vector[i] = new int[WORDS_PER_VALUE];
    }
    sign = new byte[len];
  }
  ...
}
{code}

> Implement vectorized support for the DECIMAL data type
> ------------------------------------------------------
>
>                 Key: HIVE-5762
>                 URL: https://issues.apache.org/jira/browse/HIVE-5762
>             Project: Hive
>            Reporter: Eric Hanson
>
> Add support to allow queries referencing DECIMAL columns and expression results to run efficiently in vectorized mode. Include unit tests and end-to-end tests.
> Before starting, or at least before going very far, please write a design specification (a new section for the design spec attached to HIVE-4160) for how support for the different DECIMAL types should work in vectorized mode, together with the roadmap, and have it reviewed.
> It may be feasible to re-use LongColumnVector and related VectorExpression classes for fixed-point decimal in certain data ranges. That should at least be considered, to get faster performance and save code. For unlimited-precision DECIMAL, a new column vector subtype may be needed, or a BytesColumnVector could be re-used.


```
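The comment above mentions a utility package of static functions that operate on individual values. As a rough illustration only, the sketch below adds two magnitudes stored in the proposed layout (four 32-bit words, least significant first, top bit of word 3 clear). The class and method names (`UnsignedDecimal128Sketch`, `addMagnitude`, `toBigInteger`) are hypothetical and are not taken from Hive; sign and scale handling are deliberately left out, and `BigInteger` is used only to print the result, not in the arithmetic path.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Hive: magnitude-only addition on the
// 4-word little-endian layout described in the comment above.
final class UnsignedDecimal128Sketch {
    static final int WORDS_PER_VALUE = 4;
    private static final long WORD_MASK = 0xFFFFFFFFL;

    private UnsignedDecimal128Sketch() {
    }

    /**
     * Computes result = left + right, treating each int as an unsigned
     * 32-bit word. No objects are allocated on the success path.
     * result may be the same array as left or right.
     * Throws ArithmeticException if the sum needs more than 127 bits;
     * in that case result has already been overwritten.
     */
    static void addMagnitude(int[] left, int[] right, int[] result) {
        long carry = 0L;
        for (int i = 0; i < WORDS_PER_VALUE; i++) {
            // Mask to widen each word to long without sign extension.
            long sum = (left[i] & WORD_MASK) + (right[i] & WORD_MASK) + carry;
            result[i] = (int) sum;   // keep the low 32 bits
            carry = sum >>> 32;      // 0 or 1
        }
        // The layout requires the high-order bit of word 3 to stay 0.
        if (carry != 0 || result[WORDS_PER_VALUE - 1] < 0) {
            throw new ArithmeticException("magnitude overflow");
        }
    }

    /** Debug/test helper only: converts the words to a BigInteger. */
    static BigInteger toBigInteger(int[] words) {
        BigInteger value = BigInteger.ZERO;
        for (int i = WORDS_PER_VALUE - 1; i >= 0; i--) {
            value = value.shiftLeft(32)
                         .or(BigInteger.valueOf(words[i] & WORD_MASK));
        }
        return value;
    }

    public static void main(String[] args) {
        int[] a = {0xFFFFFFFF, 0, 0, 0}; // 2^32 - 1
        int[] b = {1, 0, 0, 0};          // 1
        int[] sum = new int[WORDS_PER_VALUE];
        addMagnitude(a, b, sum);
        System.out.println(toBigInteger(sum)); // prints 4294967296
    }
}
```

The carry out of word 0 propagates into word 1, so the result words are `{0, 1, 0, 0}`. A full implementation would also have to combine the `sign` bytes and align scales before calling a routine like this one.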