drill-dev mailing list archives

From amansinha100 <...@git.apache.org>
Subject [GitHub] drill pull request #794: DRILL-5375: Nested loop join: return correct result...
Date Fri, 24 Mar 2017 19:30:20 GMT
Github user amansinha100 commented on a diff in the pull request:

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
    @@ -214,26 +226,62 @@ private boolean hasMore(IterOutcome outcome) {
        * Method generates the runtime code needed for NLJ. Other than the setup method to set the input and output value
    -   * vector references we implement two more methods
    -   * 1. emitLeft()  -> Project record from the left side
    -   * 2. emitRight() -> Project record from the right side (which is a hyper container)
    +   * vector references we implement three more methods
    +   * 1. doEval() -> Evaluates if record from left side matches record from the right
    +   * 2. emitLeft() -> Project record from the left side
    +   * 3. emitRight() -> Project record from the right side (which is a hyper container)
        * @return the runtime generated class that implements the NestedLoopJoin interface
    -   * @throws IOException
    -   * @throws ClassTransformationException
    -  private NestedLoopJoin setupWorker() throws IOException, ClassTransformationException
    -    final CodeGenerator<NestedLoopJoin> nLJCodeGenerator = CodeGenerator.get(NestedLoopJoin.TEMPLATE_DEFINITION, context.getFunctionRegistry(), context.getOptions());
    +  private NestedLoopJoin setupWorker() throws IOException, ClassTransformationException, SchemaChangeException {
    +    final CodeGenerator<NestedLoopJoin> nLJCodeGenerator = CodeGenerator.get(
    +        NestedLoopJoin.TEMPLATE_DEFINITION, context.getFunctionRegistry(), context.getOptions());
         // Uncomment out this line to debug the generated code.
     //    nLJCodeGenerator.saveCodeForDebugging(true);
         final ClassGenerator<NestedLoopJoin> nLJClassGenerator = nLJCodeGenerator.getRoot();
    +    // generate doEval
    +    final ErrorCollector collector = new ErrorCollectorImpl();
    +    /*
    +        Logical expression may contain fields from left and right batches. During code generation (materialization)
    +        we need to indicate from which input field should be taken. Mapping sets can work with only one input at a time.
    +        But non-equality expressions can be complex:
    +          select t1.c1, t2.c1, t2.c2 from t1 inner join t2 on t1.c1 between t2.c1 and
    +        or even contain self join which can not be transformed into filter since OR clause is present
    +          select *from t1 inner join t2 on t1.c1 >= t2.c1 or t1.c3 <> t1.c4
    +        In this case logical expression can not be split according to input presence (like during equality joins
    --- End diff --
    I am not sure the distinction is accurate. Even with an equality condition, the columns in your example above could come from the same table: instead of ... OR t1.c3 <> t1.c4, one could have OR t1.c3 = t1.c4, which is a valid condition.
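
    To make the point concrete, here is a minimal sketch in plain Java. It is not Drill's generated code; the class name, the int[] row layout, and the sample rows are all assumptions made for illustration. Both predicates contain a disjunct that references only the left input, and in both cases the OR prevents the condition from being split per input, whether the operator is <> or =.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration only; names and row layout are not taken from Drill.
public class NestedLoopJoinSketch {

  // Assumed layout: a t1 row is {c1, c3, c4}; a t2 row is {c1}.
  public static List<int[]> join(List<int[]> left, List<int[]> right,
                                 BiPredicate<int[], int[]> condition) {
    List<int[]> out = new ArrayList<>();
    for (int[] l : left) {
      for (int[] r : right) {
        // The whole condition is evaluated per (left, right) pair,
        // even when one disjunct only looks at the left row.
        if (condition.test(l, r)) {
          int[] joined = Arrays.copyOf(l, l.length + r.length);
          System.arraycopy(r, 0, joined, l.length, r.length);
          out.add(joined);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> t1 = Arrays.asList(new int[] {5, 1, 1}, new int[] {0, 2, 3});
    List<int[]> t2 = Arrays.asList(new int[] {3}, new int[] {9});

    // t1.c1 >= t2.c1 OR t1.c3 <> t1.c4
    BiPredicate<int[], int[]> withInequality = (l, r) -> l[0] >= r[0] || l[1] != l[2];
    // t1.c1 >= t2.c1 OR t1.c3 = t1.c4
    BiPredicate<int[], int[]> withEquality = (l, r) -> l[0] >= r[0] || l[1] == l[2];

    System.out.println(join(t1, t2, withInequality).size());
    System.out.println(join(t1, t2, withEquality).size());
  }
}
```

    With these sample rows the two predicates select different pairs, but neither can be evaluated against one input alone, so the equality/non-equality wording in the comment does not capture the real constraint.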

