drill-dev mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5586) UnionAll operator does more than necessary value vector allocation and copy
Date Wed, 14 Jun 2017 00:00:00 GMT
Jinfeng Ni created DRILL-5586:
---------------------------------

             Summary: UnionAll operator does more than necessary value vector allocation and copy
                 Key: DRILL-5586
                 URL: https://issues.apache.org/jira/browse/DRILL-5586
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni


When the inputs to a UnionAll operator are simple field references, rather than expressions
involving a function that requires evaluation, the operator should leverage the value vector's
transfer API. A transfer avoids allocating a buffer for the value vector in the outgoing batch,
as well as the overhead of copying the data from the incoming batch to the outgoing batch.
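
The difference between the two paths can be sketched with a toy model. This is only an
illustration under stated assumptions: {{ToyVector}}, {{copyTo}}, {{transferTo}} and
{{usesBuffer}} are hypothetical names made up for this sketch and are not Drill's actual value
vector classes or methods.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a value vector; NOT Drill's ValueVector API. */
final class ToyVector {
    private static final long[] EMPTY = new long[0];

    private long[] buffer;   // backing storage for the values
    private int valueCount;  // number of valid values in the buffer

    ToyVector(long[] values) {
        this.buffer = values;
        this.valueCount = values.length;
    }

    static ToyVector empty() {
        return new ToyVector(EMPTY);
    }

    /** Copy path: allocates a new buffer for the target and copies every value. */
    void copyTo(ToyVector target) {
        target.buffer = Arrays.copyOf(buffer, valueCount);
        target.valueCount = valueCount;
    }

    /** Transfer path: hands the existing buffer to the target; no allocation, no copy. */
    void transferTo(ToyVector target) {
        target.buffer = buffer;
        target.valueCount = valueCount;
        // The source gives up ownership of its buffer.
        buffer = EMPTY;
        valueCount = 0;
    }

    /** True if this vector is backed by exactly the given array instance. */
    boolean usesBuffer(long[] candidate) {
        return buffer == candidate;
    }

    int getValueCount() {
        return valueCount;
    }

    long get(int index) {
        return buffer[index];
    }
}

final class ToyVectorDemo {
    public static void main(String[] args) {
        long[] data = {10L, 20L, 30L};
        ToyVector incoming = new ToyVector(data);
        ToyVector outgoing = ToyVector.empty();

        incoming.transferTo(outgoing);
        // The outgoing vector now owns the very same array the incoming vector had.
        System.out.println("same buffer after transfer: " + outgoing.usesBuffer(data));
    }
}
```

In this sketch the copy path costs one allocation plus one pass over the data per batch, while
the transfer path only reassigns a reference, which is the saving the issue asks for.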

For example, in the following query:
{code}
select l_orderkey from cp.`tpch/lineitem.parquet` l union all select n_nationkey from cp.`tpch/nation.parquet`
{code}

Both the left and right sides of the UnionAll operator are simple field references, so Drill
should call the transfer API. However, the current code does buffer allocation and copy for
both the left and the right side. Such processing significantly slows the UnionAll operator
and, in turn, query execution.

DRILL-5521 reverts a change made in DRILL-5419 to the logic, based on a SchemaPath equality
comparison, that decides whether to apply transfer. Even if we fix that problem, a SchemaPath
equality comparison is not sufficient as the criterion for whether transfer should be used.
Ideally, even when the output field and the incoming field have different names, the UnionAll
operator should do {{transfer}} instead of {{copy}}, as long as the expression is a simple
field reference. For example:

{code}
select l_orderkey as Key1 from cp.`tpch/lineitem.parquet` l union all select n_nationkey as
Key2 from cp.`tpch/nation.parquet`
{code}
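
The two criteria discussed above can be contrasted in a small sketch. It is a toy model, not
Drill code: {{ToyExpr}}, {{ToyFieldRef}}, {{ToyFunctionCall}} and {{TransferDecision}} are
hypothetical names, and the real operator works on Drill's own expression and SchemaPath
classes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical expression model; NOT Drill's expression classes. */
interface ToyExpr {}

/** A plain reference to an incoming column, e.g. l_orderkey. */
final class ToyFieldRef implements ToyExpr {
    final String name;

    ToyFieldRef(String name) {
        this.name = name;
    }
}

/** A function applied to arguments; its result has to be evaluated into a new vector. */
final class ToyFunctionCall implements ToyExpr {
    final String function;
    final List<ToyExpr> args;

    ToyFunctionCall(String function, List<ToyExpr> args) {
        this.function = function;
        this.args = args;
    }
}

final class TransferDecision {
    private TransferDecision() {}

    /** Name-based criterion: transfer only if the input is a field with the output's name. */
    static boolean byNameEquality(ToyExpr input, String outputName) {
        return input instanceof ToyFieldRef
            && ((ToyFieldRef) input).name.equals(outputName);
    }

    /** Criterion proposed in this issue: any simple field reference can be transferred. */
    static boolean bySimpleReference(ToyExpr input) {
        return input instanceof ToyFieldRef;
    }

    public static void main(String[] args) {
        // Models "select l_orderkey as Key1 ...": a simple reference with a renamed output.
        ToyExpr renamed = new ToyFieldRef("l_orderkey");
        System.out.println("name equality:    " + byNameEquality(renamed, "Key1"));
        System.out.println("simple reference: " + bySimpleReference(renamed));

        // A function call always needs evaluation, so neither criterion allows transfer.
        ToyExpr computed = new ToyFunctionCall("add",
            Arrays.<ToyExpr>asList(new ToyFieldRef("l_orderkey"), new ToyFieldRef("l_partkey")));
        System.out.println("function call:    " + bySimpleReference(computed));
    }
}
```

For the aliased query, the name-based check rejects transfer because {{l_orderkey}} and
{{Key1}} differ, whereas the simple-reference check accepts it.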




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
