parquet-commits mailing list archives

From ga...@apache.org
Subject [parquet-format] branch master updated: PARQUET-1487: Do not write original type for timezone-agnostic timestamps (#125)
Date Wed, 09 Jan 2019 12:35:38 GMT
This is an automated email from the ASF dual-hosted git repository.

gabor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git


The following commit(s) were added to refs/heads/master by this push:
     new 2b38663  PARQUET-1487: Do not write original type for timezone-agnostic timestamps (#125)
2b38663 is described below

commit 2b38663a28ccd4156319c0bf7ae4e6280e0c6e2d
Author: Zoltan Ivanfi <zivanfi@apache.org>
AuthorDate: Wed Jan 9 13:35:34 2019 +0100

    PARQUET-1487: Do not write original type for timezone-agnostic timestamps (#125)
    
    Clarify in the comments that we should only map the new TIMESTAMP type
    to the old TIMESTAMP_MILLIS or TIMESTAMP_MICROS types when the semantics
    match (UTC normalized and the precision matches).
---
 LogicalTypes.md                | 60 +++++++++++++++++++++++++++++++++++-------
 src/main/thrift/parquet.thrift | 11 ++++++--
 2 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/LogicalTypes.md b/LogicalTypes.md
index 4c103e2..be8734a 100644
--- a/LogicalTypes.md
+++ b/LogicalTypes.md
@@ -274,11 +274,13 @@ The sort order used for `TIME` is signed.
 
 #### Deprecated time ConvertedType
 
-`TIME_MILLIS` is the deprecated ConvertedType counterpart of `TIME` logical type
-with precision `MILLIS`. Like the logical type counterpart, it must annotate an `int32`
+`TIME_MILLIS` is the deprecated ConvertedType counterpart of a `TIME` logical
+type that is UTC normalized and has `MILLIS` precision. Like the logical type
+counterpart, it must annotate an `int32`.
 
-`TIME_MICROS` is the deprecated ConvertedType counterpart of `TIME` logical type
-with precision `MICROS`. Like the logical type counterpart, it must annotate an `int64`
+`TIME_MICROS` is the deprecated ConvertedType counterpart of a `TIME` logical
+type that is UTC normalized and has `MICROS` precision. Like the logical type
+counterpart, it must annotate an `int64`.
 
 *Backward compatibility:*
 
@@ -295,7 +297,8 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
         <th>ConvertedType</th>
     </tr>
     <tr>
-        <td rowspan="2" colspan="2">TimeType</td>
+        <td rowspan="6">TimeType</td>
+        <td rowspan="3">isAdjustedToUTC = true</td>
         <td>unit = MILLIS</td>
         <td>TIME_MILLIS</td>
     </tr>
@@ -303,6 +306,23 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
         <td>unit = MICROS</td>
         <td>TIME_MICROS</td>
     </tr>
+    <tr>
+        <td>unit = NANOS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td rowspan="3">isAdjustedToUTC = false</td>
+        <td>unit = MILLIS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td>unit = MICROS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td>unit = NANOS</td>
+        <td>-</td>
+    </tr>
 </table>
 
 ### TIMESTAMP
@@ -329,11 +349,13 @@ The sort order used for `TIMESTAMP` is signed.
 
 #### Deprecated timestamp ConvertedType
 
-`TIMESTAMP_MILLIS` is the deprecated ConvertedType counterpart of `TIMESTAMP` logical type
-with precision `MILLIS`. Like the logical type counterpart, it must annotate an `int64`
+`TIMESTAMP_MILLIS` is the deprecated ConvertedType counterpart of a `TIMESTAMP`
+logical type that is UTC normalized and has `MILLIS` precision. Like the logical
+type counterpart, it must annotate an `int64`.
 
-`TIMESTAMP_MICROS` is the deprecated ConvertedType counterpart of `TIMESTAMP` logical type
-with precision `MICROS`. Like the logical type counterpart, it must annotate an `int64`
+`TIMESTAMP_MICROS` is the deprecated ConvertedType counterpart of a `TIMESTAMP`
+logical type that is UTC normalized and has `MICROS` precision. Like the logical
+type counterpart, it must annotate an `int64`.
 
 *Backward compatibility:*
 
@@ -350,7 +372,8 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
         <th>ConvertedType</th>
     </tr>
     <tr>
-        <td rowspan="2" colspan="2">TimestampType</td>
+        <td rowspan="6">TimestampType</td>
+        <td rowspan="3">isAdjustedToUTC = true</td>
         <td>unit = MILLIS</td>
         <td>TIMESTAMP_MILLIS</td>
     </tr>
@@ -358,6 +381,23 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
         <td>unit = MICROS</td>
         <td>TIMESTAMP_MICROS</td>
     </tr>
+    <tr>
+        <td>unit = NANOS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td rowspan="3">isAdjustedToUTC = false</td>
+        <td>unit = MILLIS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td>unit = MICROS</td>
+        <td>-</td>
+    </tr>
+    <tr>
+        <td>unit = NANOS</td>
+        <td>-</td>
+    </tr>
 </table>
 
 ### INTERVAL
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index c195177..7a29b80 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -326,8 +326,15 @@ union LogicalType {
   4:  EnumType ENUM           // use ConvertedType ENUM
   5:  DecimalType DECIMAL     // use ConvertedType DECIMAL
   6:  DateType DATE           // use ConvertedType DATE
-  7:  TimeType TIME           // use ConvertedType TIME_MICROS or TIME_MILLIS
-  8:  TimestampType TIMESTAMP // use ConvertedType TIMESTAMP_MICROS or TIMESTAMP_MILLIS
+
+  // use ConvertedType TIME_MICROS for TIME(isAdjustedToUTC = true, unit = MICROS)
+  // use ConvertedType TIME_MILLIS for TIME(isAdjustedToUTC = true, unit = MILLIS)
+  7:  TimeType TIME
+
+  // use ConvertedType TIMESTAMP_MICROS for TIMESTAMP(isAdjustedToUTC = true, unit = MICROS)
+  // use ConvertedType TIMESTAMP_MILLIS for TIMESTAMP(isAdjustedToUTC = true, unit = MILLIS)
+  8:  TimestampType TIMESTAMP
+
   // 9: reserved for INTERVAL
   10: IntType INTEGER         // use ConvertedType INT_* or UINT_*
   11: NullType UNKNOWN        // no compatible ConvertedType
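
The mapping rule this commit documents can be sketched in a few lines. This is a minimal illustration only, not code from parquet-format or any Parquet implementation: the function name `legacy_converted_type`, its parameters, and the string labels are hypothetical stand-ins for the Thrift `TimeType`/`TimestampType` fields. It assumes a writer that fills in the deprecated ConvertedType only when the backward-compatibility tables above list one, and leaves it unset otherwise.

```python
from typing import Optional

# Hypothetical lookup table mirroring the backward-compatibility tables in
# LogicalTypes.md: only MILLIS and MICROS have a deprecated counterpart.
_LEGACY_BY_KIND_AND_UNIT = {
    ("TIME", "MILLIS"): "TIME_MILLIS",
    ("TIME", "MICROS"): "TIME_MICROS",
    ("TIMESTAMP", "MILLIS"): "TIMESTAMP_MILLIS",
    ("TIMESTAMP", "MICROS"): "TIMESTAMP_MICROS",
}


def legacy_converted_type(kind: str, is_adjusted_to_utc: bool, unit: str) -> Optional[str]:
    """Return the deprecated ConvertedType name to write, or None for "-".

    kind is "TIME" or "TIMESTAMP"; unit is "MILLIS", "MICROS" or "NANOS".
    These string labels are assumptions made for this sketch.
    """
    # Timezone-agnostic values (isAdjustedToUTC = false) never get a
    # ConvertedType, because the deprecated types imply UTC normalization.
    if not is_adjusted_to_utc:
        return None
    # NANOS has no deprecated counterpart, so the lookup yields None.
    return _LEGACY_BY_KIND_AND_UNIT.get((kind, unit))
```

Under these assumptions, `legacy_converted_type("TIMESTAMP", True, "MICROS")` gives `"TIMESTAMP_MICROS"`, while any call with `is_adjusted_to_utc=False` or `unit="NANOS"` gives `None`, matching the `-` cells in the tables.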

