drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill pull request #644: DRILL-4980: Upgrading of the approach of parquet da...
Date Wed, 09 Nov 2016 16:59:20 GMT
Github user paul-rogers commented on a diff in the pull request:

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java
    @@ -185,7 +185,8 @@ private Metadata(FileSystem fs, ParquetFormatConfig formatConfig)
    -    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(true);
    +    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(DrillVersionInfo.getVersion(),
    +        ParquetWriter.WRITER_VERSION);
    --- End diff --
    I'm a bit confused. The writer version applies to the Parquet files that Drill writes.
    (Or, at least, that was the intention.)
    Here, we're talking about metadata. There may well be a metadata writer, but it should
    be a separate writer with its own version.
    I'm not sure we want to initialize the metadata object with the current writer version:
    there seems to be no correlation between the metadata object and the writer version.
    On the other hand, the metadata can certainly hold the writer version, but it must be
    the value read from the Parquet file itself, not a value set by the code. Otherwise, we
    have the difficult problem of making sure that the code-set version number agrees with
    the actual version number in the file.
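
    To illustrate the distinction drawn above, here is a minimal sketch in plain Java. It is
    not Drill's implementation: the class names, the footer key "example.writer.version", and
    the use of a plain Map to stand in for a Parquet footer's key/value metadata are all
    assumptions made for the example. The metadata format keeps its own version constant,
    while each file's writer version is taken only from what that file's footer reports.

```java
import java.util.Map;
import java.util.OptionalInt;

public class WriterVersionSketch {

    // Hypothetical footer key, chosen for this example; not Drill's actual property name.
    static final String WRITER_VERSION_KEY = "example.writer.version";

    // Version of the metadata format itself. It is independent of any Parquet writer version.
    // The value 3 is illustrative only.
    static final int METADATA_FORMAT_VERSION = 3;

    // Returns the writer version recorded in a file's footer key/value pairs,
    // or empty if the file does not record one (or records something unparseable).
    static OptionalInt readWriterVersion(Map<String, String> footerKeyValues) {
        String raw = footerKeyValues.get(WRITER_VERSION_KEY);
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Per-file metadata entry: the writer version is whatever the file reports,
    // never a constant supplied by the code that builds the metadata.
    static final class FileMetadata {
        final String path;
        final OptionalInt writerVersion;

        FileMetadata(String path, Map<String, String> footerKeyValues) {
            this.path = path;
            this.writerVersion = readWriterVersion(footerKeyValues);
        }
    }

    public static void main(String[] args) {
        // A file written before any writer version was recorded.
        FileMetadata older = new FileMetadata("older.parquet", Map.of());
        // A file whose footer records writer version 2.
        FileMetadata newer = new FileMetadata("newer.parquet", Map.of(WRITER_VERSION_KEY, "2"));

        System.out.println("metadata format version: " + METADATA_FORMAT_VERSION);
        System.out.println(older.path + " writer version present: " + older.writerVersion.isPresent());
        System.out.println(newer.path + " writer version present: " + newer.writerVersion.isPresent());
    }
}
```

    With this shape there is nothing to keep in sync: a file that predates the version
    property simply yields an empty value instead of inheriting the current writer's number.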

