spark-issues mailing list archives

From "holdenk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-15589) Analyze simple PySpark closures and generate SQL expressions
Date Fri, 27 May 2016 20:15:13 GMT

    [ https://issues.apache.org/jira/browse/SPARK-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304692#comment-15304692 ]

holdenk commented on SPARK-15589:
---------------------------------

Of course, this needs to wait for the Python Dataset API to exist.

> Analyze simple PySpark closures and generate SQL expressions
> ------------------------------------------------------------
>
>                 Key: SPARK-15589
>                 URL: https://issues.apache.org/jira/browse/SPARK-15589
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>            Reporter: holdenk
>
> Similar to SPARK-14083, we can try introspecting simple Python functions to see whether
> we can generate an equivalent SQL expression. This would give PySpark users an even greater
> performance increase than Scala users: not only would they benefit from better codegen,
> they would also avoid substantial serialization cost.
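
To make the idea in the description concrete, here is a minimal sketch of what "introspecting a simple Python function" could look like. It is not the approach proposed in SPARK-15589 or SPARK-14083; it only assumes that the closure is available as the source text of a one-argument lambda and that column access is written as attribute access on that argument. The names lambda_to_sql, _expr and Untranslatable are hypothetical, and the sketch ignores differences between Python and SQL semantics (null handling, string concatenation with +, and so on). It relies on ast.Constant, so it needs Python 3.8 or later.

```python
import ast

# Hypothetical mappings from Python AST operator nodes to SQL operators.
_BIN_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
_CMP_OPS = {ast.Eq: "=", ast.NotEq: "<>", ast.Lt: "<",
            ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">="}
_BOOL_OPS = {ast.And: "AND", ast.Or: "OR"}


class Untranslatable(Exception):
    """Raised when the closure uses something this sketch cannot map to SQL."""


def lambda_to_sql(source):
    """Translate the source text of a one-argument lambda into a SQL string."""
    tree = ast.parse(source.strip(), mode="eval").body
    if not isinstance(tree, ast.Lambda) or len(tree.args.args) != 1:
        raise Untranslatable("expected a one-argument lambda")
    row = tree.args.args[0].arg  # name of the row parameter, e.g. "r"
    return _expr(tree.body, row)


def _expr(node, row):
    # r.age -> column reference "age"
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == row):
        return node.attr
    if isinstance(node, ast.Constant):
        value = node.value
        # bool is a subclass of int, so it has to be checked first.
        if isinstance(value, bool):
            return "TRUE" if value else "FALSE"
        if isinstance(value, (int, float)):
            return repr(value)
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        raise Untranslatable("unsupported literal: %r" % (value,))
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return "(%s %s %s)" % (_expr(node.left, row),
                               _BIN_OPS[type(node.op)],
                               _expr(node.right, row))
    if isinstance(node, ast.BoolOp) and type(node.op) in _BOOL_OPS:
        joiner = " %s " % _BOOL_OPS[type(node.op)]
        return "(" + joiner.join(_expr(v, row) for v in node.values) + ")"
    # Only single comparisons; chained ones such as a < b < c are rejected.
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS):
        return "(%s %s %s)" % (_expr(node.left, row),
                               _CMP_OPS[type(node.ops[0])],
                               _expr(node.comparators[0], row))
    # Anything else (calls, subscripts, captured variables) is left alone,
    # so the caller can fall back to running the Python function as usual.
    raise Untranslatable(ast.dump(node))


if __name__ == "__main__":
    print(lambda_to_sql("lambda r: r.age > 21 and r.name == 'bob'"))
```

A string produced this way could be handed to something like DataFrame.filter or pyspark.sql.functions.expr, both of which accept SQL expression strings, so the predicate would be evaluated in the JVM instead of shipping rows to a Python worker. Whenever Untranslatable is raised, the original closure would simply be executed the existing way.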



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
