drill-dev mailing list archives

From Jason Altekruse <ja...@dremio.com>
Subject Re: query pushdown into HBase subscan
Date Wed, 01 Jun 2016 00:30:50 GMT
The constant folding feature is turned on by default (and can be disabled
with planner.enable_constant_folding).
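
As a rough sketch, assuming an ordinary Drill SQL session (sqlline or the
web UI), the current value of the option can be checked and, if needed,
turned back on for the session like this:

```sql
-- Inspect the current setting of the constant folding option.
SELECT * FROM sys.options
WHERE name = 'planner.enable_constant_folding';

-- Re-enable it for this session if it has been switched off.
ALTER SESSION SET `planner.enable_constant_folding` = true;
```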

It should be able to work with UDFs, as it has access to all of the same
function definitions as our standard resolution/evaluation during full
execution.

In the plan that includes the full scan, does the filter above the scan
show your expression as written (i.e. convert_from(...) =
hash_to_long('key_part1')), or has the right-hand side been reduced to a
constant value?

The next thing worth debugging would be to pre-compute the right-hand side
yourself and see whether that version of the filter gets pushed down.
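
A minimal sketch of that experiment follows. It assumes hash_to_long can be
evaluated on its own against a single-row VALUES source, and the literal in
the second statement is only a placeholder for whatever value the first
statement returns on your cluster:

```sql
-- Step 1: evaluate the UDF once to obtain the constant.
SELECT hash_to_long('key_part1') AS prefix_hash FROM (VALUES(1));

-- Step 2: substitute the returned value for the placeholder literal
-- below, then inspect the plan.
EXPLAIN PLAN FOR
SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT_BE') p1_long
FROM {table}
WHERE CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT_BE') = 1234567890123456789
LIMIT 10;
```

If the pushdown takes effect, the HBaseScanSpec in the plan should no longer
show startRow=null, stopRow=null, filter=null.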




Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer

On Tue, May 31, 2016 at 5:04 PM, Aditya <adityakishore@gmail.com> wrote:

> Hi Andrey,
>
> Drill currently does require a constant value on the right-hand side of a
> comparison operator to push down the filter.
>
> I believe that Jason had worked on a constant folding feature which would
> evaluate a constant expression during the planning phase and rewrite the
> plan to replace the expression with the corresponding constant value.
>
> Not sure if that works with UDFs as well.
>
> Jason?
>
> On Tue, May 31, 2016 at 3:54 PM, Andrey Gusev <andrey@siftscience.com>
> wrote:
>
> > Hello Drill,
> >
> > We're noticing somewhat odd behavior with the following query
> > against an HBase table.
> >
> > The key of the table is, roughly speaking,
> > *8byteHash(string1)8byteHash(string2)*
> >
> >
> > SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT') p1_long, ...
> > from {table}
> > WHERE CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT_BE') =
> > hash_to_long('key_part1') limit 10
> >
> > The query does seem to work correctly in terms of result set but times
> > out on larger tables. hash_to_long is a UDF that I wrote that converts
> > a string to a long such that the above equality can be satisfied.
> >
> > It appears that Drill doesn't push this filter down into the subscan
> > (i.e. a prefix HBase scan), even though the operator profile shows
> > HBASE_SUB_SCAN:
> >
> > [image: Inline image 1]
> >
> > The physical plan starts with an unconstrained full table scan:
> >
> > Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec
> > [tableName={table}, startRow=null, stopRow=null, filter=null],
> >
> >
> > How can we force the WHERE clause to be reflected in the scan bounds?
> >
> > We're running the latest Drill release, 1.6.
> >
> > Andrey
> >
>
