spark-issues mailing list archives

From "Takuya Ueshin (JIRA)" <>
Subject [jira] [Updated] (SPARK-22221) Add User Documentation for Working with Arrow in Spark
Date Thu, 25 Jan 2018 01:17:00 GMT


Takuya Ueshin updated SPARK-22221:
    Target Version/s: 2.3.0

> Add User Documentation for Working with Arrow in Spark
> ------------------------------------------------------
>                 Key: SPARK-22221
>                 URL:
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark, SQL
>    Affects Versions: 2.3.0
>            Reporter: Bryan Cutler
>            Priority: Major
> There needs to be user-facing documentation that shows how to enable and use Arrow with
Spark, sets expectations for what users should see, and describes any differences from similar existing functionality.
> A comment from Xiao Li on
> Given that users/applications may have Timestamp columns in their Datasets, their processing
algorithms may also contain code that relies on the corresponding time-zone-related assumptions.
> * For new users/applications: suppose they first enable Arrow and later hit an Arrow bug.
Can they simply turn off spark.sql.execution.arrow.enable? If not, what should they do?
> * For existing users/applications that want to utilize Arrow for better performance:
can they just turn on spark.sql.execution.arrow.enable? What else should they do?
> Note: hopefully the guides/solutions will be user-friendly, i.e., very simple for most
users to understand.
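The on/off toggle discussed in the bullets above can be sketched as follows. This is a minimal illustration, not part of the issue itself; it assumes PySpark 2.3+ with pandas and pyarrow installed, and uses the configuration name as released in Spark 2.3, spark.sql.execution.arrow.enabled (the comment above uses an earlier spelling of the flag).

```python
# Minimal sketch of toggling Arrow-based conversion in PySpark (assumes
# PySpark 2.3+, pandas, and pyarrow are installed).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("arrow-toggle-sketch").getOrCreate()

# Enable Arrow-based columnar data transfer for DataFrame <-> pandas conversion.
spark.conf.set("spark.sql.execution.arrow.enabled", "true")

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
pdf = df.toPandas()  # uses Arrow for the conversion when the flag is enabled

# If an Arrow-related problem is hit, the flag can simply be switched off
# again; toPandas() then uses the original row-by-row conversion path.
spark.conf.set("spark.sql.execution.arrow.enabled", "false")

spark.stop()
```

Because this is a runtime SQL configuration, no restart or code change is needed to switch paths, which is the property the questions above are probing.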

This message was sent by Atlassian JIRA

