hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-14893) vectorized execution may convert LongCV to smaller types incorrectly
Date Wed, 05 Oct 2016 01:35:21 GMT

     [ https://issues.apache.org/jira/browse/HIVE-14893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14893:
------------------------------------
    Description: 
See the results for vectorized in decimal_11 test added in HIVE-14863. 
We cast decimal to various int types; the cast is specialized for each type on non-vectorized
side; on vectorized side, it's only specialized for LongColumnVector, so all the decimals
get converted to longs. LongColumnVector gets converted to a proper type in some other mysterious
place later, and tiny/small/regular ints become truncated at that point.
Logically, I am not sure if every vectorized expression should be aware of the underlying
type for the LongColumnVector (that seems implausible - I am not sure if type information
is even available, and if so it doesn't look like it's used in other places), or if the
long-to-smaller-type automatic conversion should be fixed to produce nulls on overflow.
However it seems like a good idea to do the latter in any case, to have a catch-all for all
the vectorized expressions that might treat LongCV as representing longs at all times.
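
To make the two behaviours concrete, here is a minimal sketch in plain Java. It does not use Hive's classes: the class name NarrowingSketch, both method names, and the plain long[] / boolean[] arrays standing in for a LongColumnVector's values and null flags are assumptions made for illustration only. The first method shows what a plain narrowing cast does to an out-of-range value; the second shows one possible shape of the catch-all suggested above, marking an entry null when it does not fit the target type.

```java
import java.util.Arrays;

public class NarrowingSketch {

    /** Plain Java narrowing cast: silently keeps only the low 8 bits. */
    public static byte truncateToByte(long value) {
        return (byte) value;
    }

    /**
     * Hypothetical catch-all: checks every non-null entry against the target
     * type's range [min, max] and marks it null when it does not fit.
     */
    public static void nullOutOverflow(long[] vector, boolean[] isNull, long min, long max) {
        for (int i = 0; i < vector.length; i++) {
            if (!isNull[i] && (vector[i] < min || vector[i] > max)) {
                isNull[i] = true;
                vector[i] = 0L; // the stored value is irrelevant once the entry is null
            }
        }
    }

    public static void main(String[] args) {
        // 300 does not fit in a tinyint; the plain cast wraps it around.
        System.out.println(truncateToByte(300L)); // prints 44

        long[] vector = {1L, 300L, -129L, 127L};
        boolean[] isNull = new boolean[vector.length];
        nullOutOverflow(vector, isNull, Byte.MIN_VALUE, Byte.MAX_VALUE);
        System.out.println(Arrays.toString(isNull)); // prints [false, true, true, false]
    }
}
```

Doing the range check at the single point where the long vector is narrowed would cover every expression that writes longs, without each expression having to know the declared column type.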

  was:
See the results for vectorized in decimal_11 test added in HIVE-14863. 
We cast decimal to various int types; the cast is specialized for each type on non-vectorized
side; on vectorized side, it's only specialized for LongColumnVector. LongColumnVector gets
converted to a proper type in some other mysterious place later, and tiny/small/regular int
become truncated.
Logically, I am not sure if every vectorized expression should be aware of the underlying
type for the LongColumnVector (that seems implausible - I am not sure if type information
is even available, and if so it doesn't look like it's used in other places), or if the
long-to-smaller-type automatic conversion should be fixed to produce nulls on overflow.
However it seems like a good idea to do the latter in any case, to have a catch-all for all
the vectorized expressions that might treat LongCV as representing longs at all times.


> vectorized execution may convert LongCV to smaller types incorrectly
> --------------------------------------------------------------------
>
>                 Key: HIVE-14893
>                 URL: https://issues.apache.org/jira/browse/HIVE-14893
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Matt McCline
>
> See the results for vectorized in decimal_11 test added in HIVE-14863. 
> We cast decimal to various int types; the cast is specialized for each type on non-vectorized
side; on vectorized side, it's only specialized for LongColumnVector, so all the decimals
get converted to longs. LongColumnVector gets converted to a proper type in some other mysterious
place later, and tiny/small/regular ints become truncated at that point.
> Logically, I am not sure if every vectorized expression should be aware of the underlying
type for the LongColumnVector (that seems implausible - I am not sure if type information
is even available, and if so it doesn't look like it's used in other places), or if the
long-to-smaller-type automatic conversion should be fixed to produce nulls on overflow.
> However it seems like a good idea to do the latter in any case, to have a catch-all for
all the vectorized expressions that might treat LongCV as representing longs at all times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
