spark-reviews mailing list archives

From kiszk <...@git.apache.org>
Subject [GitHub] spark pull request #22187: [SPARK-25178][SQL] change the generated code of t...
Date Wed, 22 Aug 2018 18:22:10 GMT
GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/22187

    [SPARK-25178][SQL] change the generated code of the keySchema / valueSchema for xxxHashMapGenerator

    ## What changes were proposed in this pull request?
    
    This PR generates code that refers to a `StructType` built in the Scala code instead of constructing the `StructType` in the generated Java code. This solves two issues:
    1. It avoids using a field name such as `key.name`.
    1. It supports complicated schemas (e.g. a nested DataType).
    
    Originally, [the JIRA entry](https://issues.apache.org/jira/browse/SPARK-25178) proposed changing the generated field names of the keySchema / valueSchema to dummy names in `RowBasedHashMapGenerator` and `VectorizedHashMapGenerator`. @Ueshin suggested instead referring to a `StructType` built in the Scala code by using `ctx.addReferenceObj()`.
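
To illustrate the difference between the two approaches, here is a minimal plain-Java sketch. It is not Spark code: `SchemaReferenceSketch`, `inlineSchemaCode`, `referenceSchemaCode`, and `referenceAt` are hypothetical names, the schema is stood in for by a list of field names, and the emitted strings only loosely imitate what a code generator might produce. The real change is made in the Scala generators through `ctx.addReferenceObj()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a codegen context; NOT Spark's CodegenContext.
public class SchemaReferenceSketch {
    // Objects built ahead of time that generated code can look up by index.
    private final List<Object> references = new ArrayList<>();

    // Inline style: every field name is pasted into the generated Java source text.
    static String inlineSchemaCode(List<String> fieldNames) {
        StringBuilder sb = new StringBuilder("new StructType()");
        for (String name : fieldNames) {
            sb.append(".add(\"").append(name).append("\", DataTypes.StringType)");
        }
        return sb.toString();
    }

    // Reference style: register the prebuilt schema object and emit only an index lookup.
    String referenceSchemaCode(Object schema) {
        references.add(schema);
        int idx = references.size() - 1;
        return "((StructType) references[" + idx + "])";
    }

    // Returns the object registered at the given index.
    Object referenceAt(int idx) {
        return references.get(idx);
    }

    public static void main(String[] args) {
        // The second name contains a double quote, chosen only to stress the inline style.
        List<String> names = Arrays.asList("key.name", "a\"b");
        System.out.println(inlineSchemaCode(names));

        SchemaReferenceSketch ctx = new SchemaReferenceSketch();
        System.out.println(ctx.referenceSchemaCode(names));
    }
}
```

In this sketch the inline style copies each field name into a string literal of the emitted source, so a name containing a double quote yields malformed source text. The reference style never puts field names into the emitted text, and the registered object can have any shape, including a nested one.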
    
    ## How was this patch tested?
    
    Existing unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-25178

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22187.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22187
    
----
commit 0626de74f726622ac3eb251fc9f66aaa3de002d3
Author: Kazuaki Ishizaki <ishizaki@...>
Date:   2018-08-22T18:10:24Z

    initial commit

----