drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill pull request #729: Drill 1328: Support table statistics for Parquet
Date Sat, 11 Feb 2017 21:57:14 GMT
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/729#discussion_r100676401
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -390,4 +391,15 @@
     
       String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
       BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
    +
    +  /**
    +   * Option whose value is a long value representing the number of bits required for computing ndv (using HLL)
    +   */
    +  LongValidator NDV_MEMORY_LIMIT = new PositiveLongValidator("exec.statistics.ndv_memory_limit", 30, 20);
    +
    +  /**
    +   * Option whose value represents the current version of the statistics. Decreasing the value will generate
    +   * the older version of statistics
    +   */
    +  LongValidator STATISTICS_VERSION = new NonNegativeLongValidator("exec.statistics.capability_version", 1, 1);
    --- End diff --
    
    Not sure this is clear, or desirable. When the stats are computed, they use the version of the code that computes them, right? Are we saying that the user can choose to use an older version of the code for the computation? Or that the code has if statements to support all old versions? If so, this would be the only place in Drill that does so.
    
    On the read side, doesn't the code have to use the version of the code compatible with the version of the stats in the file? How can I use, say, version 2 of stats with a version 3 file?
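
To make the read-side concern concrete, here is a minimal sketch of what I would expect, under stated assumptions: the reader picks its decoding logic from the version recorded in the stats file, not from a session option. All names here (`StatsVersionSketch`, `StatsDecoder`, `read`) are hypothetical and are not taken from the PR or from Drill.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Drill code: the version stored in the file selects the decoder.
class StatsVersionSketch {

  /** Hypothetical decoder for one on-disk statistics format version. */
  interface StatsDecoder {
    String decode(String payload);
  }

  // Registry of decoders keyed by the version recorded in the stats file.
  private static final Map<Long, StatsDecoder> DECODERS = new TreeMap<>();

  static {
    DECODERS.put(1L, payload -> "v1:" + payload);
    DECODERS.put(2L, payload -> "v2:" + payload);
  }

  /** Decodes a payload with the decoder that matches the file's version. */
  static String read(long fileVersion, String payload) {
    StatsDecoder decoder = DECODERS.get(fileVersion);
    if (decoder == null) {
      // A version 3 file cannot be read by code that only knows versions 1 and 2.
      throw new IllegalStateException("Unsupported statistics version: " + fileVersion);
    }
    return decoder.decode(payload);
  }

  public static void main(String[] args) {
    System.out.println(read(1L, "rowcount=10"));
    System.out.println(read(2L, "rowcount=10"));
    try {
      read(3L, "rowcount=10");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch a user-settable version could only influence what gets written; on read, the file's own version decides, and an unknown version fails fast.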
    
    Maybe some background explanation is needed (in the spec? Somewhere in the JIRA or code?) to explain the use case.


