parquet-commits mailing list archives

From nkol...@apache.org
Subject [parquet-format] branch master updated: PARQUET-1627: Update specification so that legacy timestamp logical types can be written for local semantics as well (#148)
Date Thu, 08 Aug 2019 07:54:34 GMT
This is an automated email from the ASF dual-hosted git repository.

nkollar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git


The following commit(s) were added to refs/heads/master by this push:
     new 345282c  PARQUET-1627: Update specification so that legacy timestamp logical types can be written for local semantics as well (#148)
345282c is described below

commit 345282ce307a9c9dcc15b6f3c9106be379ec26ba
Author: Nándor Kollár <nandorKollar@users.noreply.github.com>
AuthorDate: Thu Aug 8 09:54:30 2019 +0200

    PARQUET-1627: Update specification so that legacy timestamp logical types can be written for local semantics as well (#148)
---
 LogicalTypes.md                | 20 ++++++++++++++++----
 src/main/thrift/parquet.thrift |  8 ++++----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/LogicalTypes.md b/LogicalTypes.md
index 0398a5c..c605857 100644
--- a/LogicalTypes.md
+++ b/LogicalTypes.md
@@ -282,6 +282,12 @@ counterpart, it must annotate an `int32`.
 type that is UTC normalized and has `MICROS` precision. Like the logical type
 counterpart, it must annotate an `int64`.
 
+Despite there is no exact corresponding ConvertedType for local time semantic,
+in order to support forward compatibility with those libraries, which annotated
+their local time with legacy `TIME_MICROS` and `TIME_MILLIS` annotation,
+Parquet writer implementation *must* annotate local time with legacy annotations too,
+as shown below.
+
 *Backward compatibility:*
 
 | ConvertedType | LogicalType |
@@ -313,11 +319,11 @@ counterpart, it must annotate an `int64`.
     <tr>
         <td rowspan="3">isAdjustedToUTC = false</td>
         <td>unit = MILLIS</td>
-        <td>-</td>
+        <td>TIME_MILLIS</td>
     </tr>
     <tr>
         <td>unit = MICROS</td>
-        <td>-</td>
+        <td>TIME_MICROS</td>
     </tr>
     <tr>
         <td>unit = NANOS</td>
@@ -452,6 +458,12 @@ type counterpart, it must annotate an `int64`.
 logical type that is UTC normalized and has `MICROS` precision. Like the logical
 type counterpart, it must annotate an `int64`.
 
+Despite there is no exact corresponding ConvertedType for local timestamp semantic,
+in order to support forward compatibility with those libraries, which annotated
+their local timestamps with legacy `TIMESTAMP_MICROS` and `TIMESTAMP_MILLIS` annotation,
+Parquet writer implementation *must* annotate local timestamps with legacy annotations too,
+as shown below.
+
 *Backward compatibility:*
 
 | ConvertedType | LogicalType |
@@ -483,11 +495,11 @@ type counterpart, it must annotate an `int64`.
     <tr>
         <td rowspan="3">isAdjustedToUTC = false</td>
         <td>unit = MILLIS</td>
-        <td>-</td>
+        <td>TIMESTAMP_MILLIS</td>
     </tr>
     <tr>
         <td>unit = MICROS</td>
-        <td>-</td>
+        <td>TIMESTAMP_MICROS</td>
     </tr>
     <tr>
         <td>unit = NANOS</td>
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 27dcd93..da90acd 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -327,12 +327,12 @@ union LogicalType {
   5:  DecimalType DECIMAL     // use ConvertedType DECIMAL
   6:  DateType DATE           // use ConvertedType DATE
 
-  // use ConvertedType TIME_MICROS for TIME(isAdjustedToUTC = true, unit = MICROS)
-  // use ConvertedType TIME_MILLIS for TIME(isAdjustedToUTC = true, unit = MILLIS)
+  // use ConvertedType TIME_MICROS for TIME(isAdjustedToUTC = *, unit = MICROS)
+  // use ConvertedType TIME_MILLIS for TIME(isAdjustedToUTC = *, unit = MILLIS)
   7:  TimeType TIME
 
-  // use ConvertedType TIMESTAMP_MICROS for TIMESTAMP(isAdjustedToUTC = true, unit = MICROS)
-  // use ConvertedType TIMESTAMP_MILLIS for TIMESTAMP(isAdjustedToUTC = true, unit = MILLIS)
+  // use ConvertedType TIMESTAMP_MICROS for TIMESTAMP(isAdjustedToUTC = *, unit = MICROS)
+  // use ConvertedType TIMESTAMP_MILLIS for TIMESTAMP(isAdjustedToUTC = *, unit = MILLIS)
   8:  TimestampType TIMESTAMP
 
   // 9: reserved for INTERVAL
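
The mapping this commit describes can be sketched as a small lookup. This is an illustration only, not code from parquet-format or any Parquet library: the function name `legacy_converted_type` and the string constants are hypothetical. It assumes, as the updated tables and Thrift comments state, that the legacy ConvertedType depends only on the unit (MILLIS or MICROS) and no longer on isAdjustedToUTC, and that NANOS has no legacy counterpart.

```python
from typing import Optional

# Hypothetical lookup table; the names are illustrative, not a Parquet API.
# After this commit the legacy annotation is chosen by unit alone.
_LEGACY_BY_UNIT = {
    "TIME": {"MILLIS": "TIME_MILLIS", "MICROS": "TIME_MICROS"},
    "TIMESTAMP": {"MILLIS": "TIMESTAMP_MILLIS", "MICROS": "TIMESTAMP_MICROS"},
}

_UNITS = ("MILLIS", "MICROS", "NANOS")


def legacy_converted_type(kind: str, is_adjusted_to_utc: bool, unit: str) -> Optional[str]:
    """Return the legacy ConvertedType name a writer would emit, or None.

    is_adjusted_to_utc is accepted but deliberately ignored: the updated
    specification uses the same legacy annotation for UTC-normalized and
    local semantics.
    """
    if kind not in _LEGACY_BY_UNIT:
        raise ValueError(f"unsupported logical type: {kind!r}")
    if unit not in _UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    # NANOS is absent from the table, so .get() returns None for it.
    return _LEGACY_BY_UNIT[kind].get(unit)


if __name__ == "__main__":
    for utc in (True, False):
        for unit in _UNITS:
            print("TIMESTAMP", utc, unit, "->", legacy_converted_type("TIMESTAMP", utc, unit))
```

Because local and UTC-normalized values now share the same legacy annotation, a reader that sees only the ConvertedType cannot tell the two apart; the distinction is carried by the LogicalType alone.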

