drill-dev mailing list archives

From gparai <...@git.apache.org>
Subject [GitHub] drill pull request #729: Drill 1328 r4
Date Wed, 25 Jan 2017 23:06:41 GMT
GitHub user gparai opened a pull request:

    https://github.com/apache/drill/pull/729

    Drill 1328 r4

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gparai/drill Drill-1328-r4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #729
    
----
commit 7ed459d340d99ba1e4a8df6b66465a272ce51f02
Author: Cliff Buchanan <cbuchanan@maprtech.com>
Date:   2014-08-21T21:59:53Z

    DRILL-1328: Support table statistics
    
    PRE: Add "append" concept to directory write.
    
    * This is so stats can be stored in [table].stats.drill and be appended to by writing
      a new file into the directory.
    
    FUNCS: Statistics functions as UDFs:
    Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get
    confused. All stats columns should be Nullable, so that stats functions can return NULL when
    N/A.
    * custom versions of "count" that always return BigInt
    * HyperLogLog based NDV that returns BigInt that works only on VarChars
    * HyperLogLog with binary output that only works on VarChars
    
    OPS: Updated protobufs for new ops
    
    OPS: Implemented StatisticsAggregate
    
    OPS: Implemented StatisticsUnpivot
    
    ANALYZE: AnalyzeTable functionality
    * JavaCC syntax more-or-less copied from LucidDB.
    * (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel and StatsAggPrel
    
    ANALYZE: Add getMetadataTable() to AbstractSchema
    
    USAGE: Change field access in QueryWrapper
    
    USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel
    * Since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable
      to the constructor.
    * This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable
      associated with Logical/Physical scans.
    
    USAGE: Attach DrillTableMetadata to DrillTable.
    * DrillTableMetadata represents the data scanned from a corresponding ".stats.drill" table
    * In order to avoid doing query execution right after the ".stats.drill" table is found,
      metadata is not actually collected until the MaterializationVisitor is used.
    ** Currently, the metadata source must be a string (so that a SQL query can be created).
       Doing this with a table is probably more complicated.
    ** Query is set up to extract only the most recent statistics results for each column.
    
    USAGE: Configure DrillJoinRelBase to use NDV metadata when available.
    
    USAGE: attach metadata to table
    
    USAGE: implement optiq provider
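
For readers unfamiliar with the NDV function listed under FUNCS above, the following is a minimal sketch of the general HyperLogLog technique for estimating the number of distinct values in a string column. It is not the UDF from this pull request: the class name, the register count (2^12), and the hash function (FNV-1a plus a finalizer mix) are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: NOT the code from this pull request.
final class HllNdvSketch {
    private static final int P = 12;        // assumed precision: 2^12 registers
    private static final int M = 1 << P;
    private final byte[] registers = new byte[M];

    // Assumed hash: FNV-1a over the UTF-8 bytes, followed by a 64-bit finalizer mix.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Record one value; NULLs are skipped and do not count as a distinct value.
    void add(String value) {
        if (value == null) {
            return;
        }
        long h = hash64(value);
        int idx = (int) (h >>> (64 - P));   // top P bits select the register
        long rest = h << P;                 // remaining bits determine the rank
        int rank = (rest == 0) ? (64 - P + 1) : Long.numberOfLeadingZeros(rest) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    // Combine two partial sketches, e.g. ones built over different fragments of a table.
    void merge(HllNdvSketch other) {
        for (int i = 0; i < M; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }

    // Approximate number of distinct values seen so far.
    long estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / M);
        double raw = alpha * M * M / sum;
        if (raw <= 2.5 * M && zeros > 0) {
            raw = M * Math.log((double) M / zeros);  // small-range (linear counting) correction
        }
        return Math.round(raw);
    }

    public static void main(String[] args) {
        HllNdvSketch sketch = new HllNdvSketch();
        for (int i = 0; i < 100_000; i++) {
            sketch.add("key-" + (i % 10_000));  // 10,000 distinct keys, each repeated
        }
        System.out.println("approximate NDV: " + sketch.estimate());
    }
}
```

Because merging only takes the per-register maximum, partial sketches can be built independently and combined afterwards, which is presumably why the commit lists a binary-output variant alongside the BigInt one.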

commit 9771a732ad9d266937c5f5a263cca2e09ee6f4f6
Author: Gautam Parai <gparai@maprtech.com>
Date:   2014-08-21T21:59:53Z

    DRILL-1328: Support table statistics
    
    PRE: Add "append" concept to directory write.
    
    * This is so stats can be stored in [table].stats.drill and be appended to by writing
      a new file into the directory.
    
    FUNCS: Statistics functions as UDFs:
    Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get
    confused. All stats columns should be Nullable, so that stats functions can return NULL when
    N/A.
    * custom versions of "count" that always return BigInt
    * HyperLogLog based NDV that returns BigInt that works only on VarChars
    * HyperLogLog with binary output that only works on VarChars
    
    OPS: Updated protobufs for new ops
    
    OPS: Implemented StatisticsAggregate
    
    OPS: Implemented StatisticsUnpivot
    
    ANALYZE: AnalyzeTable functionality
    * JavaCC syntax more-or-less copied from LucidDB.
    * (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel and StatsAggPrel
    
    ANALYZE: Add getMetadataTable() to AbstractSchema
    
    USAGE: Change field access in QueryWrapper
    
    USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel
    * Since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable
      to the constructor.
    * This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable
      associated with Logical/Physical scans.
    
    USAGE: Attach DrillTableMetadata to DrillTable.
    * DrillTableMetadata represents the data scanned from a corresponding ".stats.drill" table
    * In order to avoid doing query execution right after the ".stats.drill" table is found,
      metadata is not actually collected until the MaterializationVisitor is used.
    ** Currently, the metadata source must be a string (so that a SQL query can be created).
       Doing this with a table is probably more complicated.
    ** Query is set up to extract only the most recent statistics results for each column.
    
    USAGE: Configure DrillJoinRelBase to use NDV metadata when available.
    
    USAGE: attach metadata to table
    
    USAGE: implement optiq provider
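
The commit message says the metadata query extracts "only the most recent statistics results for each column". Since stats are appended as new files, older results stay on disk and must be filtered out at read time. A minimal sketch of that selection, assuming a hypothetical row shape (column name, computation timestamp, NDV) that is not taken from the pull request:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: the StatRow fields below are hypothetical, not the ".stats.drill" schema.
final class LatestStatsSketch {
    static final class StatRow {
        final String column;
        final long computedAtMillis;
        final long ndv;

        StatRow(String column, long computedAtMillis, long ndv) {
            this.column = column;
            this.computedAtMillis = computedAtMillis;
            this.ndv = ndv;
        }
    }

    // Keep, for every column, the row with the largest timestamp.
    static Map<String, StatRow> latestPerColumn(List<StatRow> rows) {
        Map<String, StatRow> latest = new TreeMap<>();
        for (StatRow row : rows) {
            latest.merge(row.column, row,
                (kept, candidate) -> candidate.computedAtMillis > kept.computedAtMillis ? candidate : kept);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<StatRow> appended = Arrays.asList(
            new StatRow("city", 1_000L, 40L),
            new StatRow("zip", 1_000L, 900L),
            new StatRow("city", 2_000L, 42L));  // a later run appended a newer result
        for (StatRow row : latestPerColumn(appended).values()) {
            System.out.println(row.column + " ndv=" + row.ndv);
        }
    }
}
```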

commit 2bd531a3885415496c0c5f5ea445aedbd7aad07c
Author: Gautam Parai <gparai@maprtech.com>
Date:   2016-11-24T03:30:18Z

    Parallel statistics computation

commit 84189392719325e25a87c65ac7fb46e38b503882
Author: Aman Sinha <asinha@maprtech.com>
Date:   2016-11-30T01:53:52Z

    Fix distribution traits for 2 phase analyze.  Add statistics_merge operator to CoreOperatorType
    and fix enum values.

commit 32123eeeeb9649c37ba90bb06b9ecabd62323b67
Author: Gautam Parai <gparai@maprtech.com>
Date:   2016-12-02T21:23:34Z

    Add support for all stats

commit 218ab2d7c9ffe7263a9f458d0fa6025824b62e97
Author: Gautam Parai <gparai@maprtech.com>
Date:   2016-12-09T22:48:04Z

    Old costing without statistics and new costing with statistics

commit 2f4ad909326b9020d87cfa70169d8d403aaa005c
Author: Gautam Parai <gparai@maprtech.com>
Date:   2017-01-05T23:48:07Z

    Fix Calcite AnalyzeSimpleEquiJoin

commit a92ddacfd84b378eb375f207990a8dbebba705cd
Author: Gautam Parai <gparai@maprtech.com>
Date:   2017-01-06T21:40:23Z

    Fix NDV overestimate when groupKey relies on one side of the join

commit af94db7a4e3d91d276be0202406dbc1cd748345e
Author: Gautam Parai <gparai@maprtech.com>
Date:   2017-01-25T00:13:35Z

    Code cleanup

----
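
Several of the commits above concern using NDV statistics in join costing. The notification does not say which formula the planner applies, so the following is only a sketch of the textbook equi-join estimate, rows(L) * rows(R) / max(ndv(L), ndv(R)), with a fallback when NDV is unavailable. The class name, method name, and fallback selectivity are assumptions made for illustration.

```java
// Illustrative only: a textbook estimate, not necessarily what DrillJoinRelBase computes.
final class JoinRowCountSketch {
    // Arbitrary placeholder used when no NDV statistics are available.
    static final double ASSUMED_DEFAULT_SELECTIVITY = 0.15;

    // A non-positive NDV argument signals "statistics not available" in this sketch.
    static double estimateEquiJoinRows(double leftRows, double rightRows,
                                       double leftNdv, double rightNdv) {
        if (leftNdv <= 0 || rightNdv <= 0) {
            // No usable statistics: fall back to a fixed guess.
            return leftRows * rightRows * ASSUMED_DEFAULT_SELECTIVITY;
        }
        // Each value on the side with fewer distinct keys is assumed to match
        // a value on the other side, so the larger NDV bounds the selectivity.
        return leftRows * rightRows / Math.max(leftNdv, rightNdv);
    }

    public static void main(String[] args) {
        System.out.println(estimateEquiJoinRows(1_000_000, 1_000, 1_000, 1_000));
        System.out.println(estimateEquiJoinRows(1_000_000, 1_000, -1, -1));
    }
}
```

With a fact table of 1,000,000 rows joined to a dimension table of 1,000 rows on a key with 1,000 distinct values, the NDV-based estimate is 1,000,000 rows, whereas the fixed-selectivity fallback guesses far more. That gap is the kind of difference table statistics are meant to close.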


