datafu-dev mailing list archives

From "Eyal Allweil (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DATAFU-148) Setup Spark sub-project
Date Tue, 26 Feb 2019 14:46:00 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778002#comment-16778002 ]

Eyal Allweil commented on DATAFU-148:
-------------------------------------

Ohad and I had some time to work on this, so we added the "scala-python bridge" to the [spark-tmp|https://github.com/apache/datafu/tree/spark-tmp/datafu-spark]
branch. [~russell.jurney], you can take it and try it out in PySpark; it should work.

We still need to add proper documentation, but I've put a rudimentary version in our README
that explains how to call the DataFu Scala APIs from PySpark.

I'll add instructions for calling arbitrary Python code from Scala later. For now, you can look
at the [test which does this|https://github.com/apache/datafu/blob/spark-tmp/datafu-spark/src/test/scala/datafu/spark/TestScalaPythonBridge.scala#L73].

> Setup Spark sub-project
> -----------------------
>
>                 Key: DATAFU-148
>                 URL: https://issues.apache.org/jira/browse/DATAFU-148
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Eyal Allweil
>            Assignee: Eyal Allweil
>            Priority: Major
>         Attachments: patch.diff, patch.diff
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Create a skeleton Spark sub project for Spark code to be contributed to DataFu



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
