hive-issues mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16418) Allow HiveKey to skip some bytes for comparison
Date Mon, 17 Apr 2017 23:51:42 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971840#comment-15971840 ]

Ashutosh Chauhan commented on HIVE-16418:
-----------------------------------------

We need to think about the storage type for Timestamp at different stages of query processing:

* On-disk format: whether to store the TZ or not. The primary concern is fidelity of the original
data, and the secondary concern is storage efficiency.
* In-memory format: the format on which computations are performed. As I see it, our current
Timestamp choice here is inappropriate. The issue is that java.sql.Timestamp (which implicitly
assumes the local time zone) doesn't correspond to either SQL Timestamp (which is essentially
zoneless) or Timestamp with Time Zone (which has a zone, but java.sql.Timestamp doesn't allow you
to set one). As I suggested, the in-memory representation (i.e. the one on which all computations
are performed) should either directly use LocalTimeZone and ZonedTimeZone or model its behavior
on them.
* Serialization format: used to transfer timestamps between different vertices. Here the primary
concern is performance, which is gained if the TZ is stored separately.
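
To illustrate the in-memory point above, here is a minimal Java sketch. It assumes that
"LocalTimeZone and ZonedTimeZone" refer to java.time.LocalDateTime and java.time.ZonedDateTime;
that reading is an assumption, not something stated in the comment. The class and method names
(TimestampModels, zoneless, zoned, legacy) are hypothetical and are not part of Hive.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class TimestampModels {
    // Zoneless wall-clock value, in the spirit of SQL TIMESTAMP.
    public static LocalDateTime zoneless() {
        return LocalDateTime.of(2017, 4, 17, 23, 51, 42);
    }

    // The same wall-clock fields with an explicit zone attached,
    // in the spirit of TIMESTAMP WITH TIME ZONE.
    public static ZonedDateTime zoned(LocalDateTime ldt, String zoneId) {
        return ldt.atZone(ZoneId.of(zoneId));
    }

    // java.sql.Timestamp interprets the fields in the JVM default zone,
    // so the resulting epoch value depends on where the code runs.
    public static Timestamp legacy(LocalDateTime ldt) {
        return Timestamp.valueOf(ldt);
    }

    public static void main(String[] args) {
        LocalDateTime ldt = zoneless();
        System.out.println("zoneless:       " + ldt);
        System.out.println("as UTC instant: " + zoned(ldt, "UTC").toInstant());
        System.out.println("as Tokyo inst.: " + zoned(ldt, "Asia/Tokyo").toInstant());
        System.out.println("legacy millis:  " + legacy(ldt).getTime());
    }
}
```

The zoneless value carries no instant until a zone is attached, and the two zoned values map
to different instants. The legacy value silently picks the default zone, which is the mismatch
described above.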

In light of the above, I am OK with your proposal of using choice #2, but I think you still need
to think about the in-memory format, because apart from to_utc_timestamp and related UDFs,
implementing the new type, Timestamp with Time Zone, with java.sql.Timestamp will be error-prone.

> Allow HiveKey to skip some bytes for comparison
> -----------------------------------------------
>
>                 Key: HIVE-16418
>                 URL: https://issues.apache.org/jira/browse/HIVE-16418
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-16418.1.patch
>
>
> The feature is required when we have to serialize some fields and prevent them from being
> used in comparison, e.g. HIVE-14412.
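
The idea in the description can be sketched in Java as follows. This is not the HiveKey API or
the attached patch. The class SkipBytesKey, its methods, and the key layout (8 bytes of
sign-flipped epoch millis followed by the zone ID bytes) are all assumptions made for
illustration. The zone bytes are serialized with the key but excluded from comparison.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class SkipBytesKey {
    // Hypothetical layout: 8 bytes of epoch millis, then the zone ID bytes.
    // Flipping the sign bit makes unsigned byte order match numeric order.
    public static byte[] serialize(ZonedDateTime zdt) {
        byte[] zone = zdt.getZone().getId().getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + zone.length);
        buf.putLong(zdt.toInstant().toEpochMilli() ^ Long.MIN_VALUE);
        buf.put(zone);
        return buf.array();
    }

    // Unsigned lexicographic comparison that ignores the last skipA / skipB
    // bytes of each key.
    public static int compareSkippingTail(byte[] a, int skipA, byte[] b, int skipB) {
        int lenA = a.length - skipA;
        int lenB = b.length - skipB;
        int n = Math.min(lenA, lenB);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return lenA - lenB;
    }

    public static void main(String[] args) {
        ZonedDateTime utc = ZonedDateTime.of(2017, 4, 17, 23, 51, 42, 0, ZoneId.of("UTC"));
        ZonedDateTime tokyo = utc.withZoneSameInstant(ZoneId.of("Asia/Tokyo"));
        byte[] k1 = serialize(utc);
        byte[] k2 = serialize(tokyo);
        // Skip everything after the 8-byte instant, i.e. the zone ID bytes.
        int cmp = compareSkippingTail(k1, k1.length - Long.BYTES, k2, k2.length - Long.BYTES);
        System.out.println("same instant compares as: " + cmp);
    }
}
```

With this layout, two keys for the same instant in different zones compare as equal, while the
zone still travels with the serialized bytes.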



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
