spark-reviews mailing list archives

From cloud-fan <...@git.apache.org>
Subject [GitHub] spark pull request #18931: [SPARK-21717][SQL] Decouple consume functions of ...
Date Tue, 23 Jan 2018 17:19:09 GMT
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18931#discussion_r163315164
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala ---
    @@ -149,13 +149,100 @@ trait CodegenSupport extends SparkPlan {
     
         ctx.freshNamePrefix = parent.variablePrefix
         val evaluated = evaluateRequiredVariables(output, inputVars, parent.usedInputs)
    +
    +    // Under certain conditions, we can put the logic to consume the rows of this operator into
    +    // another function. So we can prevent a generated function too long to be optimized by JIT.
    +    // The conditions:
    +    // 1. The parent uses all variables in output. we can't defer variable evaluation when consume
    +    //    in another function.
    +    // 2. The output variables are not empty. If it's empty, we don't bother to do that.
    +    // 3. We don't use row variable. The construction of row uses deferred variable evaluation. We
    +    //    can't do it.
    +    // 4. The number of output variables must less than maximum number of parameters in Java method
    +    //    declaration.
    +    val requireAllOutput = output.forall(parent.usedInputs.contains(_))
    +    val consumeFunc =
    +      if (row == null && outputVars.nonEmpty && requireAllOutput && isValidParamLength(ctx)) {
    +        constructDoConsumeFunction(ctx, inputVars)
    +      } else {
    +        parent.doConsume(ctx, inputVars, rowVar)
    +      }
         s"""
            |${ctx.registerComment(s"CONSUME: ${parent.simpleString}")}
            |$evaluated
    -       |${parent.doConsume(ctx, inputVars, rowVar)}
    +       |$consumeFunc
          """.stripMargin
       }
     
    +  /**
    +   * In Java, a method descriptor is valid only if it represents method parameters with a total
    +   * length of 255 or less. `this` contributes one unit and a parameter of type long or double
    +   * contributes two units. Besides, for nullable parameters, we also need to pass a boolean
    +   * for the null status.
    +   */
    +  private def isValidParamLength(ctx: CodegenContext): Boolean = {
    --- End diff --
    
    Shall we put it into `CodegenContext` as a util function so that we can use it in other places in the future?
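
For illustration only, here is a minimal sketch of what a standalone length check of this kind might look like if it were pulled out of the operator. It follows the rule described in the Javadoc above: one unit for `this`, two units for `long` or `double`, one unit for any other type, plus one extra boolean for each nullable parameter. The names `Param`, `ParamLength`, `slotsOf`, and `paramLength` are hypothetical and are not taken from the pull request; the real method takes a `CodegenContext`, which this sketch does not model.

```scala
// Hypothetical stand-in for one parameter of a generated method:
// the Java type name plus whether a separate isNull flag must be passed.
final case class Param(javaType: String, nullable: Boolean)

object ParamLength {
  // Maximum total parameter length allowed by a JVM method descriptor.
  val MaxJvmParamLength = 255

  // long and double contribute two units; every other type contributes one.
  def slotsOf(javaType: String): Int = javaType match {
    case "long" | "double" => 2
    case _ => 1
  }

  // One unit for `this`, plus each parameter's units,
  // plus one boolean unit for each nullable parameter.
  def paramLength(params: Seq[Param]): Int =
    1 + params.map(p => slotsOf(p.javaType) + (if (p.nullable) 1 else 0)).sum

  // True when the parameter list fits within the descriptor limit.
  def isValidParamLength(params: Seq[Param]): Boolean =
    paramLength(params) <= MaxJvmParamLength
}
```

Keeping the check as a pure function of the parameter list, rather than of the operator's own `output`, is one way it could be reused by other code paths that split generated code into methods.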

