drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From paul-rogers <...@git.apache.org>
Subject [GitHub] drill pull request #644: DRILL-4980: Upgrading of the approach of parquet da...
Date Thu, 03 Nov 2016 16:46:28 GMT
Github user paul-rogers commented on a diff in the pull request:

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java
    @@ -927,15 +927,11 @@ public void setMax(Object max) {
         @JsonProperty List<ParquetFileMetadata_v2> files;
         @JsonProperty List<String> directories;
         @JsonProperty String drillVersion;
    -    @JsonProperty boolean isDateCorrect;
    +    @JsonProperty int writerVersion;
         public ParquetTableMetadata_v2() {
    -      super();
    -    }
    -    public ParquetTableMetadata_v2(boolean isDateCorrect) {
           this.drillVersion = DrillVersionInfo.getVersion();
    -      this.isDateCorrect = isDateCorrect;
    +      this.writerVersion = ParquetWriter.WRITER_VERSION;
    --- End diff --
    We set the writer version to the current version when we create the metadata. Is this
same metadata used for both read and write? If so, we have the potential for a nasty bug.
A (new) reader fails to set the writerVersion value from actual file metadata. The value will
default to the latest, even if the file itself happens to be older.
    I wonder if it makes sense to pass the version into the constructor. The Writer passes
in the current writer version. The reader must pass in the value found in the file.
    Or, is this metadata used only for writing, but not reading? If that is true perhaps we
can document that in the code somewhere. (I looked but did not anything.)
    Or, is this metadata cached from scanning actual files? If so, isn't defaulting the writer
version simply asking for trouble if someone forgets to set this field based on actual file

If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.

View raw message