drill-dev mailing list archives

From Andrey Gusev <and...@siftscience.com>
Subject query pushdown into HBase subscan
Date Tue, 31 May 2016 22:54:14 GMT
Hello Drill,

We're noticing somewhat odd behavior with the following query against an
HBase table.

The key of the table is, roughly speaking,
*8byteHash(string1)8byteHash(string2)*


SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT') p1_long, ...
from {table}
WHERE CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT_BE') =
hash_to_long('key_part1') limit 10

The query does seem to return the correct result set, but it times out
on larger tables. hash_to_long is a UDF that I wrote; it converts a
string to a long such that the above equality can be satisfied.

It appears that Drill doesn't push this filter down into the subscan (i.e.
a prefix HBase scan), even though the operator profile shows HBASE_SUB_SCAN:

[image: Inline image 1]

The physical plan starts with an unconstrained full table scan:

Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec
[tableName={table}, startRow=null, stopRow=null, filter=null],


How can we force the WHERE clause to be reflected in the scan bounds?
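
To illustrate what "scan bounds" means here, this is a minimal sketch in
Python (not Drill or HBase code) of the startRow/stopRow that a prefix scan
on the first 8 bytes of the key would need. The function names and the sample
hash value are hypothetical, and it assumes the hash is stored as a signed
8-byte big-endian value, as the 'BIGINT_BE' encoding in the query suggests.

```python
import struct


def prefix_start_row(hash_value: int) -> bytes:
    """Encode a signed 64-bit hash as 8 big-endian bytes (the key prefix)."""
    return struct.pack(">q", hash_value)


def prefix_stop_row(prefix: bytes) -> bytes:
    """Return the smallest byte string greater than every key with `prefix`.

    Returns b"" (no upper bound) when the prefix consists only of 0xFF bytes.
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    # Increment the last byte that is not 0xFF and drop everything after it.
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


# Hypothetical value standing in for hash_to_long('key_part1').
start = prefix_start_row(0x0102030405060708)
stop = prefix_stop_row(start)
```

With bounds like these, the scan would only touch rows whose key begins with
the 8-byte hash of the first string, rather than startRow=null, stopRow=null.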

We're running the latest Drill, 1.6.

Andrey
