drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6202) Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
Date Tue, 03 Apr 2018 10:43:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423828#comment-16423828 ]

ASF GitHub Bot commented on DRILL-6202:

Github user parthchandra commented on the issue:

    I think we need to include a few other folks in this. @paul-rogers and @sachouche have
also looked into the issue of excessive bounds checking and ways to write to direct memory
with minimum overhead.
    Both Salim and Paul have done work that tries to eliminate the excessive checking
and use `PlatformDependent` directly, so this might be the right time to agree on the
approach. At a high level, I believe there is agreement that we need to 1) reduce bounds
checking to (preferably) once per write, and 2) minimise the number of function calls before
memory is actually written to.
    We have three layers where we could potentially check bounds: i) the operators, ii) the
vectors, iii) DrillBuf. Right now, we check at every level, and multiple times at that. Paul's
work on batch sizing provides us with a layer that gives us the bounds check guarantees at
the operator level. This means we could potentially use the value vectors' set methods (as opposed
to the setSafe methods), and DrillBuf could use `PlatformDependent` directly.
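
To make the "check once at the operator level, then write unchecked" idea concrete, here is a minimal sketch. It does not use Drill's API: `ToyIntVector`, `ToyOperator`, `checkRange`, `setChecked` and the `boundsChecks` counter are hypothetical names invented for illustration, and a plain `java.nio.ByteBuffer` stands in for DrillBuf (it still checks bounds internally, which a write through `PlatformDependent` would not).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical stand-in for a fixed-width int vector; not Drill's actual API. */
final class ToyIntVector {
    private final ByteBuffer buf;
    int boundsChecks = 0; // explicit checks performed so far, for illustration only

    ToyIntVector(int valueCapacity) {
        // Little-endian direct memory, the byte order value vectors assume.
        buf = ByteBuffer.allocateDirect(valueCapacity * Integer.BYTES)
                        .order(ByteOrder.LITTLE_ENDIAN);
    }

    int valueCapacity() {
        return buf.capacity() / Integer.BYTES;
    }

    /** Validates that indices [0, count) fit; meant to be called once per batch. */
    void checkRange(int count) {
        boundsChecks++;
        if (count < 0 || count > valueCapacity()) {
            throw new IndexOutOfBoundsException(
                "count " + count + " exceeds capacity " + valueCapacity());
        }
    }

    /** Checked variant: one explicit check on every single write. */
    void setChecked(int index, int value) {
        boundsChecks++;
        if (index < 0 || index >= valueCapacity()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        set(index, value);
    }

    /** No explicit check of its own; the caller must guarantee a valid index. */
    void set(int index, int value) {
        buf.putInt(index * Integer.BYTES, value);
    }

    int get(int index) {
        return buf.getInt(index * Integer.BYTES);
    }
}

/** Hypothetical operator that owns the bounds guarantee for a whole batch. */
final class ToyOperator {
    static void writeBatch(ToyIntVector vector, int[] values) {
        vector.checkRange(values.length);   // the only explicit check
        for (int i = 0; i < values.length; i++) {
            vector.set(i, values[i]);       // unchecked write path
        }
    }

    public static void main(String[] args) {
        ToyIntVector vector = new ToyIntVector(4);
        writeBatch(vector, new int[] {10, 20, 30, 40});
        System.out.println("explicit checks: " + vector.boundsChecks);
    }
}
```

Writing the same four values through `setChecked` would perform four explicit checks instead of one; the sketch only illustrates where the check lives, not how much time it saves.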
    Some caveats:
    UDLEs check for and enforce little-endianness. Checking for endianness is important for
value vectors because they assume little endian, but the enforcement is sometimes not so
desirable. Drill's Java client uses the same DrillBuf backed by a UDLE, which means that
client applications can no longer run on big-endian machines (and yes, I have heard this request
from end users). However, the fact is that UDLEs are an intrinsic part of the drill-memory
design [1] [2]. Eliminating UDLEs could mean redoing large parts of very well-tested code.
    The caveat to using the vector set methods is that the setSafe methods provide resizing
capability that operators have come to rely upon. Switching from setSafe to set breaks practically
every operator.
    [1] https://github.com/jacques-n/drill/blob/DRILL-4144/exec/memory/base/src/main/java/org/apache/drill/exec/memory/README.md
    [2] https://docs.google.com/nonceSigner?nonce=nj279efks0ro0&continue=https://doc-0o-as-docs.googleusercontent.com/docs/securesc/gipu3hlcf22l6svruqr71h7qe2k3djum/5v7eb2cm4bghq76nj658ai5hkk9h52ur/1522749600000/11021365158327727934/11021365158327727934/0B6CmYjIAywyCalVwcURkaFlkc1U?e%3Ddownload&hash=41l8jspccbj1pp63750c5von8ol4ijtl

> Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
> ----------------------------------------------------------------
>                 Key: DRILL-6202
>                 URL: https://issues.apache.org/jira/browse/DRILL-6202
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>            Priority: Major
>             Fix For: 1.14.0
> As bounds checking may be enabled or disabled, using IndexOutOfBoundsException to resize
> vectors is unreliable. It works only when bounds checking is enabled.
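
The failure mode described in the issue can be sketched with a toy model. Everything here is an assumption made for illustration, not Drill code: `ToyVector`, its static `boundsChecking` switch, `reAlloc`, `lostWrites` and both `setSafe...` methods are hypothetical names. An out-of-range write with checking disabled is merely counted in `lostWrites`; against real direct memory it would be an invalid write.

```java
import java.util.Arrays;

/** Hypothetical toy vector; names and behaviour are illustrative, not Drill's API. */
final class ToyVector {
    // Stand-in for a global "bounds checking enabled" switch.
    static boolean boundsChecking = true;

    private int[] data;
    int lostWrites = 0; // out-of-range writes that went undetected

    ToyVector(int capacity) {
        data = new int[capacity];
    }

    int capacity() {
        return data.length;
    }

    int get(int index) {
        return data[index];
    }

    /** Raw write: throws on overflow only if bounds checking is enabled. */
    void set(int index, int value) {
        if (index >= data.length) {
            if (boundsChecking) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            lostWrites++; // simulated invalid write: nothing signals the overflow
            return;
        }
        data[index] = value;
    }

    /** Doubles the capacity, keeping existing values. */
    void reAlloc() {
        data = Arrays.copyOf(data, Math.max(1, data.length * 2));
    }

    /** Exception-driven resize: works only while bounds checking is on. */
    void setSafeViaException(int index, int value) {
        try {
            set(index, value);
        } catch (IndexOutOfBoundsException e) {
            while (index >= capacity()) {
                reAlloc();
            }
            set(index, value);
        }
    }

    /** Explicit capacity check: independent of the bounds checking switch. */
    void setSafeViaCapacityCheck(int index, int value) {
        while (index >= capacity()) {
            reAlloc();
        }
        set(index, value);
    }

    public static void main(String[] args) {
        boundsChecking = false;
        ToyVector viaException = new ToyVector(2);
        viaException.setSafeViaException(5, 42);
        ToyVector viaCheck = new ToyVector(2);
        viaCheck.setSafeViaCapacityCheck(5, 42);
        System.out.println("exception-driven capacity: " + viaException.capacity()
            + ", lost writes: " + viaException.lostWrites);
        System.out.println("explicit-check capacity: " + viaCheck.capacity()
            + ", lost writes: " + viaCheck.lostWrites);
    }
}
```

With the switch off, the exception-driven variant never sees an exception, so it never resizes, while the explicit comparison against `capacity()` grows the vector in both modes.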

This message was sent by Atlassian JIRA
