parquet-commits mailing list archives

From ziva...@apache.org
Subject [parquet-format] branch master updated: PARQUET-1437: Misleading comment in parquet.thrift (#115)
Date Tue, 30 Oct 2018 09:32:41 GMT
This is an automated email from the ASF dual-hosted git repository.

zivanfi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git


The following commit(s) were added to refs/heads/master by this push:
     new e568691  PARQUET-1437: Misleading comment in parquet.thrift (#115)
e568691 is described below

commit e5686914f00026ccc9dceb0e2f6b1f18a1dbed0d
Author: Zoltan Ivanfi <zivanfi@apache.org>
AuthorDate: Tue Oct 30 10:32:36 2018 +0100

    PARQUET-1437: Misleading comment in parquet.thrift (#115)
    
    The documentation for list<ColumnOrder> column_orders stated that "Each
    sort order corresponds to one column, determined by its position in the
    list, matching the position of the column in the schema."
    
    However, in reality, while the order of elements in these two
    lists (schema and sort order) is the same, only leaf nodes are
    represented in the list of sort orders, so the positions do not match.
---
 src/main/thrift/parquet.thrift | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 378aa47..c195177 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -903,8 +903,9 @@ struct FileMetaData {
 
   /**
    * Sort order used for the min_value and max_value fields of each column in
-   * this file. Each sort order corresponds to one column, determined by its
-   * position in the list, matching the position of the column in the schema.
+   * this file. Sort orders are listed in the order matching the columns in the
+   * schema. The indexes are not necessary the same though, because only leaf
+   * nodes of the schema are represented in the list of sort orders.
    *
    * Without column_orders, the meaning of the min_value and max_value fields is
    * undefined. To ensure well-defined behaviour, if min_value and max_value are
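
To illustrate the distinction drawn in the commit message, here is a minimal Python sketch that is not part of the commit. It assumes the schema is available as the flattened, depth-first list of elements stored in FileMetaData, and it models each element with a hypothetical SchemaElement dataclass holding only a name and num_children. The helper leaf_positions is also hypothetical, and treating "no children" as "leaf" is a simplification made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SchemaElement:
    """Hypothetical stand-in for a schema entry; only the fields needed here."""
    name: str
    num_children: Optional[int] = None  # unset for leaf (primitive) columns


def leaf_positions(schema: List[SchemaElement]) -> List[Tuple[int, int, str]]:
    """Pair each leaf's index in the flattened schema with its index in column_orders.

    Assumption: an element with no children is a leaf. Group nodes
    (including the root) are skipped, so the two indexes drift apart.
    """
    result = []
    for schema_index, element in enumerate(schema):
        if not element.num_children:
            # The next free slot in column_orders belongs to this leaf.
            result.append((schema_index, len(result), element.name))
    return result


# Example schema: root { id, address { city, zip } }, flattened depth-first.
schema = [
    SchemaElement("root", 2),
    SchemaElement("id"),
    SchemaElement("address", 2),
    SchemaElement("city"),
    SchemaElement("zip"),
]

mapping = leaf_positions(schema)
for schema_index, order_index, name in mapping:
    print(f"{name}: schema[{schema_index}] -> column_orders[{order_index}]")
```

In this example the relative order of id, city and zip is the same in both lists, but their indexes differ because the root and address group nodes occupy schema positions without having an entry in the list of sort orders.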

