arrow-dev mailing list archives

From "Jacek Laskowski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARROW-288) Implement Arrow adapter for Spark Datasets
Date Thu, 22 Sep 2016 11:38:21 GMT

    [ https://issues.apache.org/jira/browse/ARROW-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513039#comment-15513039 ]

Jacek Laskowski commented on ARROW-288:
---------------------------------------

I've scheduled a [Spark/Scala meetup|http://www.meetup.com/WarsawScala/events/234156519/]
for next week and found this issue, which we might be able to help with. We have no experience
with Arrow, but we are quite comfortable with Spark SQL's Datasets.

Could you, [~wesmckinn] or [~julienledem], describe the small steps needed for the task?
They could also just be subtasks of the "umbrella" task. Thanks.

> Implement Arrow adapter for Spark Datasets
> ------------------------------------------
>
>                 Key: ARROW-288
>                 URL: https://issues.apache.org/jira/browse/ARROW-288
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Java - Vectors
>            Reporter: Wes McKinney
>
> It would be valuable for applications that use Arrow to be able to 
> * Convert between Spark DataFrames/Datasets and Java Arrow vectors
> * Send / Receive Arrow record batches / Arrow file format RPCs to / from Spark 
> * Allow PySpark to use Arrow for messaging in UDF evaluation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
