drill-issues mailing list archives

From "Pritesh Maker (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (DRILL-5912) Hash Join Enhancement: Avoid copying probe side values
Date Fri, 27 Oct 2017 21:54:01 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker reassigned DRILL-5912:
------------------------------------

    Assignee: Boaz Ben-Zvi

> Hash Join Enhancement: Avoid copying probe side values
> ------------------------------------------------------
>
>                 Key: DRILL-5912
>                 URL: https://issues.apache.org/jira/browse/DRILL-5912
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Boaz Ben-Zvi
>            Assignee: Boaz Ben-Zvi
>            Priority: Minor
>
> When the Hash Join Operator (inner, or left outer) performs the "probe and project" task, it copies each probe-side value to be projected. Example:
> {code}
>     public void projectProbeRecord(int probeIndex, int outIndex)
>         throws SchemaChangeException
>     {
>         // One copyFromSafe call per projected probe-side column, run for each output row.
>         {
>             vv15.copyFromSafe((probeIndex), (outIndex), vv12);
>         }
>         {
>             vv21.copyFromSafe((probeIndex), (outIndex), vv18);
>         }
>     }
> {code}
> In the case where there are no duplicate-key entries in the build side, and no spilling took place, each of the outer (probe-side) values is projected exactly once (for left outer), or at most once (for inner join).
> In such (common) cases, we could avoid the above copy, and just transfer the value vectors as is (or add a Selection Vector 2 for the inner join, to eliminate the unmatched entries).
> This can be a significant performance enhancement, as copying each set of values is much more expensive than transferring vectors (e.g., performing the copy 64K times, plus allocation of the vectors, and possible resizing for variable-sized types).
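
The difference between the two approaches described above can be sketched in plain Java. This is only a toy model under stated assumptions, not Drill's API: IntVector, Batch, projectByCopy and projectByTransfer are hypothetical names invented for illustration, the "vector" is a bare int array, and the join is assumed to be an inner join with no duplicate build-side keys and no spilling.

```java
import java.util.Arrays;

/** Toy model (not Drill's API): per-row copy versus buffer transfer plus a selection vector. */
final class ProbeProjectionSketch {

    /** Hypothetical stand-in for a fixed-width value vector. */
    static final class IntVector {
        int[] data;
        IntVector(int[] data) { this.data = data; }
    }

    /** Hypothetical output batch: one column and an optional selection vector (null = all rows). */
    static final class Batch {
        final IntVector column;
        final int[] selection;

        Batch(IntVector column, int[] selection) {
            this.column = column;
            this.selection = selection;
        }

        /** Resolves the selection vector into a dense array, for comparing the two paths. */
        int[] materialize() {
            if (selection == null) {
                return Arrays.copyOf(column.data, column.data.length);
            }
            int[] out = new int[selection.length];
            for (int i = 0; i < selection.length; i++) {
                out[i] = column.data[selection[i]];
            }
            return out;
        }
    }

    /** Copy path: allocate a new vector and write every matched probe value into it. */
    static Batch projectByCopy(IntVector probe, boolean[] matched) {
        int[] out = new int[probe.data.length];
        int outIndex = 0;
        for (int probeIndex = 0; probeIndex < probe.data.length; probeIndex++) {
            if (matched[probeIndex]) {
                out[outIndex++] = probe.data[probeIndex];
            }
        }
        return new Batch(new IntVector(Arrays.copyOf(out, outIndex)), null);
    }

    /** Transfer path: hand the existing buffer to the output and only record which rows survive. */
    static Batch projectByTransfer(IntVector probe, boolean[] matched) {
        int[] selection = new int[matched.length];
        int kept = 0;
        for (int probeIndex = 0; probeIndex < matched.length; probeIndex++) {
            if (matched[probeIndex]) {
                selection[kept++] = probeIndex;
            }
        }
        IntVector out = new IntVector(probe.data); // same buffer, no per-value copy
        probe.data = new int[0];                   // ownership moved to the output
        return new Batch(out, Arrays.copyOf(selection, kept));
    }

    public static void main(String[] args) {
        boolean[] matched = {true, false, true, true};
        IntVector forCopy = new IntVector(new int[] {10, 20, 30, 40});
        IntVector forTransfer = new IntVector(new int[] {10, 20, 30, 40});
        System.out.println(Arrays.toString(projectByCopy(forCopy, matched).materialize()));
        System.out.println(Arrays.toString(projectByTransfer(forTransfer, matched).materialize()));
    }
}
```

Both paths yield the same logical rows. In this sketch the transfer path still walks the match flags to build the selection vector, but it never touches the column values themselves, and for a left outer join (where every probe row is kept) the selection vector could be omitted entirely.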



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
