spark-reviews mailing list archives

From gatorsmile <...@git.apache.org>
Subject [GitHub] spark pull request #19480: [SPARK-22226][SQL] splitExpression can create too...
Date Thu, 26 Oct 2017 18:51:36 GMT
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19480#discussion_r147233642
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -801,10 +831,84 @@ class CodegenContext {
                |  ${makeSplitFunction(body)}
                |}
              """.stripMargin
    -        addNewFunction(name, code)
    +        addNewFunctionInternal(name, code, inlineToOuterClass = false)
           }
     
    -      foldFunctions(functions.map(name => s"$name(${arguments.map(_._2).mkString(", ")})"))
    +      // Here we store all the methods which have been added to the outer class.
    +      val outerClassFunctions = functions
    +        .filter(_.innerClassName.isEmpty)
    +        .map(_.functionName)
    +
    +      val innerClassFunctions = generateInnerClassesMethodsCalls(
    +        functions.filter(_.innerClassName.nonEmpty),
    +        func,
    +        arguments,
    +        returnType,
    +        makeSplitFunction,
    +        foldFunctions)
    +
    +      val argsString = arguments.map(_._2).mkString(", ")
    +      foldFunctions((outerClassFunctions ++ innerClassFunctions).map(
    +        name => s"$name($argsString)"))
    +    }
    +  }
    +
    +  /**
    +   * Here we handle all the methods which have been added to the inner classes and
    +   * not to the outer class.
    +   * Since they can be many, their direct invocation in the outer class adds many entries
    +   * to the outer class' constant pool. This can cause the constant pool to past JVM limit.
    +   * Moreover, this can cause also the outer class method where all the invocations are
    +   * performed to grow beyond the 64k limit.
    +   * To avoid these problems, we group them and we call only the grouping methods in the
    +   * outer class.
    +   *
    +   * @param functions a [[Seq]] of [[NewFunctionSpec]] defined in the inner classes
    +   * @param funcName the split function name base.
    +   * @param arguments the list of (type, name) of the arguments of the split function.
    +   * @param returnType the return type of the split function.
    +   * @param makeSplitFunction makes split function body, e.g. add preparation or cleanup.
    +   * @param foldFunctions folds the split function calls.
    +   * @return an [[Iterable]] containing the methods' invocations
    +   */
    +  private def generateInnerClassesMethodsCalls(
    +      functions: Seq[NewFunctionSpec],
    +      funcName: String,
    +      arguments: Seq[(String, String)],
    +      returnType: String,
    +      makeSplitFunction: String => String,
    +      foldFunctions: Seq[String] => String): Iterable[String] = {
    +    val innerClassToFunctions = mutable.ListMap.empty[(String, String), Seq[String]]
    --- End diff --
    
    Why not use `LinkedHashMap`?
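
    To make the suggestion concrete, here is a minimal sketch of the grouping step using `mutable.LinkedHashMap`, which iterates in insertion order and looks keys up by hash, whereas `mutable.ListMap` is list-backed and its lookups are linear in the number of entries. The `NewFunctionSpec` case class below is a hypothetical stand-in: `functionName` and `innerClassName` appear in the diff, but `innerClassInstance`, the `GroupByInnerClass` object and the `group` helper are assumptions made for illustration, not the PR's code.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the NewFunctionSpec in the diff. Only functionName
// and innerClassName are visible there; innerClassInstance is assumed.
case class NewFunctionSpec(
    functionName: String,
    innerClassName: Option[String],
    innerClassInstance: Option[String])

object GroupByInnerClass {
  // Groups function names by (inner class name, inner class instance).
  // LinkedHashMap keeps keys in the order they were first inserted, so the
  // generated calls come out in a stable order.
  def group(
      functions: Seq[NewFunctionSpec]): mutable.LinkedHashMap[(String, String), Seq[String]] = {
    val innerClassToFunctions =
      mutable.LinkedHashMap.empty[(String, String), Seq[String]]
    functions.foreach { f =>
      // Functions without an inner class (outer-class functions) are skipped.
      for (cls <- f.innerClassName; inst <- f.innerClassInstance) {
        val key = (cls, inst)
        val existing = innerClassToFunctions.getOrElse(key, Seq.empty[String])
        // Updating an existing key does not change its position in the map.
        innerClassToFunctions.put(key, existing :+ f.functionName)
      }
    }
    innerClassToFunctions
  }
}
```

    Calling `GroupByInnerClass.group(specs)` and iterating over the result visits inner classes in first-seen order, which is the property the grouping code relies on when emitting calls.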

