drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "eugen yushin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-4666) Pushdown doesn't apply for HBase with substr(key) from UT
Date Wed, 11 May 2016 13:03:12 GMT
eugen yushin created DRILL-4666:
-----------------------------------

             Summary: Pushdown doesn't apply for HBase with substr(key) from UT
                 Key: DRILL-4666
                 URL: https://issues.apache.org/jira/browse/DRILL-4666
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.6.0
            Reporter: eugen yushin


Following [example|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java]
section, running query from {{testFilterPushDownCompositeBigIntRowKey1()}} results in following
execution plan:
{code}
EXPLAIN PLAN FOR
SELECT
     CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') d
    ,CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'bigint_be') id
    ,CONVERT_FROM(tableName.f.c, 'UTF8')
FROM hbase.`TestTableCompositeDate` tableName
WHERE
    CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') = cast(1409040000000 as bigint)
;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(d=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 1, 8))], id=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0,
9, 8))], EXPR$2=[CONVERT_FROMUTF8(ITEM($1, 'c'))])
00-02        SelectionVectorRemover
00-03          Filter(condition=[=(CONVERT_FROM(BYTE_SUBSTR($0, 1, 8), 'bigint_be'), 1409040000000)])
00-04            Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=TestTableCompositeDate,
startRow=null, stopRow=null, filter=null], columns=[`*`]]])
{code}

>From the above, Drill uses full scan and then filters out rows by key substring started
from 1st position.

This query executes pretty fast in test dataset provided in repo, but performance dramatically
decreases with real use cases.

I've used _contrib\storage-hbase\src\test\java\org\apache\drill\hbase\TestTableGenerator.java_
to populate test table.

Moreover, [TestHBaseFilterPushDown|TestHBaseFilterPushDown.java] uses [runHBaseSQLVerifyCount|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java]
to pass the tests. It checks result set count, and not execution plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message