drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jinfeng Ni <jinfengn...@gmail.com>
Subject Re: In list filter evaluation : room for improvement in run-time code generation.
Date Fri, 11 Sep 2015 17:57:38 GMT
The patch I submitted tries to address the redundant retrieval when
the field references are used in the same JBlock.  For instance,
" a + a * 3.5 * sqrt(a) ", each reference of "a" would originally lead
to code like the following:

76:                 IntHolder out3 = new IntHolder();
77:                 {
78:                     out3 .value = vv0 .getAccessor().get((inIndex));
79:                 }

Now, it would create only IntHolder and shared across the entire
JBlock.

The short circuit evaluation makes a bit tricky for the sharing of
common expression. In some cases, the short circuit will skip
some code, including the initialization of holder shared by multiple
reference, which would cause problem. That's why I only remove
the redundant retrieval within the same JBlock.

The code does not intend to reduce the redundancy for general
common expression; it only handles the common field reference.
If we have " SELECT a + 100  as COL1,  sqrt (a  + 100) as COL2 ",
the CodeGenerator does not detect "a + 100" is a common expression;
in stead, I think it makes sense for query planner to put the common
expression into a Project operator.

      Project  ( $0 as Col1,  sqrt ($0) as COL2)
         |
     Project  ( a + 100 as $0)

I just realized the current patch does not remove redundancy across
multiple expressions in the Project operator.

    SELECT    a + 100,  a * 100


You are right that I should put some logic in ClassGenerator level.
I'll re-submit a patch shortly.



On Thu, Sep 10, 2015 at 7:00 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> I haven't looked at your patch yet. Do you try to address the issue where
> we do redundant retrieval for all situations or only for some. Seems like
> it should be managed at the class generator level since it could have this
> context.
>
> Also note that you probably would see a substantial benefit if you modify
> the instance level variables to avoid object creation (the place where
> scalar replacement doesn't fully solve the problem).
>
>
> On Sep 9, 2015 1:31 PM, "Jinfeng Ni" <jinfengni99@gmail.com> wrote:
>
>> Weeks ago there was a message on drill user list, reporting performance
>> issues caused by in list filter [1].  The query has filter:
>>
>> WHERE
>>    c0 IN (v_00, v_01, v_02, v_03, ... )
>> OR
>>    c1 IN (v_11, v_11, v_12, v_13, ....)
>> OR
>>    c2 IN ...
>> OR
>>    c3 IN ...
>> OR
>>    ....
>>
>> The profile shows that most of query time is spent on filter evaluation.
>> One workaround that we recommend was to re-write the query so that the
>> planner would convert in list into join operation. Turns out that
>> converting
>> into join did help improve performance, but not as much as we wanted.
>>
>> The original query has parquet as the data source. Therefore, the ideal
>> solution is parquet filter pushdown, which DRILL-1950 would address.
>>
>> On the other hand, I noticed that there seems to be room for improvement
>> in the run-time generated code. In particular, for " c0 in (v_00, v_01,
>> ...)",
>> Drill will evaluate it as :
>>     c0 = v_00  OR c0 = v_01 OR ...
>>
>> Each reference of "c0" will lead to initialization of vector and holder
>> assignment in the generated code. There is redundant evaluation for
>> the common reference.
>>
>> I put together a patch,which will avoid the redundant evaluation for the
>> common reference.  Using TPCH scale factor 10's lineitem table, I saw
>> quite surprising improvement. (run on Mac with embedded drillbit)
>>
>> 1) In List uses integer type [2]
>>   master branch :  12.53 seconds
>>   patch on top of master branch : 7.073 seconds
>> That's almost 45% improvement.
>>
>> 2) In List uses binary type [3]
>>   master branch :  198.668 seconds
>> patch on top of master branch: 20.37 seconds
>>
>> Two thoughts:
>> 1. Will code size impact Janino compiler optimization or jvm hotspot
>> optimization? Otherwise, it seems hard to explain the performance
>> difference of removing the redundant evaluation. That might imply
>> that the efficiency of run-time generated code may degrade with
>> more expressions in the query (?)
>>
>> 2. For In-List filter, it might make sense to create a Drill UDF. The
>> UDF will build a heap-based hashtable in setup, in a similar way
>> as what the join approach will do.
>>
>>  I'm going to open a JIRA to submit the patch for review, as I feel
>> it will benefit not only the in list filter, but also expressions with
>> common column references.
>>
>>
>> [1]
>>
>> https://mail-archives.apache.org/mod_mbox/drill-user/201508.mbox/%3CCAC-7oTym0Yzr2RmXhDPag6k41se-uTkWu0QC%3DMABb7s94DJ0BA%40mail.gmail.com%3E
>>
>> [2] https://gist.github.com/jinfengni/7f6df9ed7d2c761fed33
>>
>> [3]  https://gist.github.com/jinfengni/7460f6d250f0d00009ed
>>

Mime
View raw message