drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4980) Upgrading of the approach of parquet date correctness status detection
Date Thu, 10 Nov 2016 15:24:58 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654323#comment-15654323 ]

ASF GitHub Bot commented on DRILL-4980:
---------------------------------------

Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/644#discussion_r87416017
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java ---
    @@ -185,7 +185,8 @@ private Metadata(FileSystem fs, ParquetFormatConfig formatConfig) {
             childFiles.add(file);
           }
         }
    -    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(true);
    +    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(DrillVersionInfo.getVersion(),
    +        ParquetWriter.WRITER_VERSION);
    --- End diff --
    
    `is.date.correct` or `parquet-writer.version` was needed in the metadata cache file for quick
detection of date value correctness. Without one of them, we would need to check the
`files.rowGroups.columns.mxValue` values from this cache file.
    But after some thought, I realized that, thanks to the newly added `ParquetTableMetadata_v3`,
we can check the following:
    If the version of the parquet metadata cache file is 3, the date values are definitely correct.
Otherwise (when the parquet metadata cache file was generated earlier), we need to check the date
values from this file.
    So `writerVersion` is now redundant in `ParquetTableMetadataBase`, and I deleted it. Could you
confirm whether this makes sense?
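
    For illustration, the check described above could be sketched roughly as follows. This is
a hypothetical sketch, not code from the pull request: the class `DateCorrectnessSketch`, the
method `datesAreCorrect`, and the year-5000 cutoff are all assumptions made for the example, and
the maximum values are assumed to be stored as days since the Unix epoch.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical sketch; none of these names are Drill's actual API.
final class DateCorrectnessSketch {

    // Assumed cutoff for illustration only: dates later than year 5000 are treated as corrupt.
    static final long ASSUMED_CORRUPTION_THRESHOLD_DAYS =
        LocalDate.of(5000, 1, 1).toEpochDay();

    /**
     * @param cacheVersion        version of the parquet metadata cache file (assumed to be an int)
     * @param dateColumnMaxValues max values of date columns read from the cache file,
     *                            assumed to be days since the epoch
     * @return true if the date values can be considered correct
     */
    static boolean datesAreCorrect(int cacheVersion, List<Long> dateColumnMaxValues) {
        // A version 3 cache file implies correct date values, so no further check is needed.
        if (cacheVersion == 3) {
            return true;
        }
        // Earlier cache files: fall back to inspecting the stored max values.
        for (long max : dateColumnMaxValues) {
            if (max > ASSUMED_CORRUPTION_THRESHOLD_DAYS) {
                return false;
            }
        }
        return true;
    }
}
```

    With this shape, a version 3 cache file is trusted without reading any column statistics,
while an older file is accepted only if none of its date maximums exceeds the assumed cutoff.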


> Upgrading of the approach of parquet date correctness status detection
> ----------------------------------------------------------------------
>
>                 Key: DRILL-4980
>                 URL: https://issues.apache.org/jira/browse/DRILL-4980
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>    Affects Versions: 1.8.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>             Fix For: 1.9.0
>
>
> This JIRA is a follow-up to [DRILL-4203|https://issues.apache.org/jira/browse/DRILL-4203].
> The date correctness label for newly generated parquet files should be upgraded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
