drill-dev mailing list archives

From "abdelhakim deneche" <adene...@gmail.com>
Subject Re: Review Request 28417: DRILL-1742 Use Hive stats when planning queries on Hive data sources
Date Tue, 25 Nov 2014 00:56:42 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Nov. 25, 2014, 12:56 a.m.)

Review request for drill.


Handles the case where the Hive metadata doesn't include a numRows property.
When the row count is available, we return GroupScanProperty.EXACT_ROW_COUNT.

Bugs: DRILL-1742

Repository: drill-git


HiveScan.getSplits() already fetches the table and partition metadata using MetaStoreUtils.
We compute the total number of rows from the numRows property and store it in the rowCount
attribute, which getScanStats() later returns.
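
A rough sketch of the row-count computation described above, for illustration only. It is not
the code in the diff: the class HiveRowCountSketch, the methods rowsOf() and totalRows(), and
the UNKNOWN sentinel are hypothetical names, and plain maps stand in for the table and
partition parameters that the patch reads from the Hive metastore. Only the "numRows" property
name comes from the description. Treating a negative or unparsable value as "unknown" is an
assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the row-count logic; not the actual HiveScan code.
final class HiveRowCountSketch {
    static final String NUM_ROWS = "numRows"; // property name taken from the description
    static final long UNKNOWN = -1L;          // hypothetical sentinel for "no usable stats"

    // Reads numRows from one parameter map; returns UNKNOWN if it is missing or invalid.
    static long rowsOf(Map<String, String> params) {
        String value = (params == null) ? null : params.get(NUM_ROWS);
        if (value == null) {
            return UNKNOWN;
        }
        try {
            long n = Long.parseLong(value.trim());
            return (n >= 0) ? n : UNKNOWN; // assumption: negative means "not computed"
        } catch (NumberFormatException e) {
            return UNKNOWN;
        }
    }

    // Sums numRows over the selected partitions, or uses the table-level value when the
    // table is not partitioned. A single missing value makes the whole total UNKNOWN,
    // so the caller can fall back to an estimate instead of reporting an exact count.
    static long totalRows(Map<String, String> tableParams,
                          List<Map<String, String>> partitionParams) {
        if (partitionParams == null || partitionParams.isEmpty()) {
            return rowsOf(tableParams);
        }
        long total = 0;
        for (Map<String, String> params : partitionParams) {
            long n = rowsOf(params);
            if (n == UNKNOWN) {
                return UNKNOWN;
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> p1 = Collections.singletonMap(NUM_ROWS, "100");
        Map<String, String> p2 = Collections.singletonMap(NUM_ROWS, "250");
        System.out.println(totalRows(null, Arrays.asList(p1, p2)));
    }
}
```

Passing only the partitions that survive pruning is what would make the total reflect the
pruned scan rather than the whole table.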

Diffs (updated)

  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScan.java ddbc100

Diff: https://reviews.apache.org/r/28417/diff/


Created several partitioned and non-partitioned tables and loaded data into them in Hive.

Used explain plan to check the number of rows when the whole table is queried and also when
specific partitions are queried (to make sure the row count takes Hive partition pruning into
account).


abdelhakim deneche
