datafu-dev mailing list archives

From "Russell Jurney (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DATAFU-148) Setup Spark sub-project
Date Tue, 12 Mar 2019 02:02:00 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790113#comment-16790113
] 

Russell Jurney edited comment on DATAFU-148 at 3/12/19 2:01 AM:
----------------------------------------------------------------

I appreciate your work on this, and I'm trying to make it work in Python 3. A print
statement crashed previously. I'm using findspark to check whether the files run locally
in Python from the shell. The pyspark_utils modules all do.

I got it to run using Python 3.6.7 on Ubuntu 17.10. This is great! I'll do some more playing
around and then dive into the code before committing it. I found the following issues:
 * There was a Python 2.7 print statement in init_spark_context.py. Can you wrap it in parentheses?
 * The master README.md has no note about running the tests for datafu-spark.
 * Can you remove the '...' from the README? People are going to paste this code, and the
ellipses make that hard.
 * Exporting PYTHONPATH did not work for me; I had to pass --jars and --conf as shown in the
README. I don't know whether this is a DataFu issue or a Spark issue.
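On the first bullet: a Python 2-only print statement is a SyntaxError under Python 3, where print is a built-in function, and wrapping the argument in parentheses fixes it. The message text below is illustrative, not the actual line from init_spark_context.py:

```python
# Python 2 only -- a SyntaxError under Python 3:
#     print "initializing spark context"
#
# Parenthesized form, valid on both Python 2 and 3 (the message text is
# illustrative, not the actual line from init_spark_context.py):
message = "initializing spark context"
print(message)
```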

I want to thank you for breathing life back into DataFu. With Spark, the project can continue
to thrive! This is something I meant to do for years and never got around to. Thanks again!



> Setup Spark sub-project
> -----------------------
>
>                 Key: DATAFU-148
>                 URL: https://issues.apache.org/jira/browse/DATAFU-148
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Eyal Allweil
>            Assignee: Eyal Allweil
>            Priority: Major
>         Attachments: patch.diff, patch.diff
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Create a skeleton Spark sub project for Spark code to be contributed to DataFu



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
