matthayes commented on issue #15: Add Spark functionality to DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-509865726
+1
I reviewed the recent code changes and these look good to me. I am able to build the JAR
via `assemble`.
However, I ran into an issue running the tests that I'm still investigating. It may
be environmental.
```
> Task :datafu-spark:test
datafu.spark.TestScalaPythonBridge > pyfromscala.py FAILED
java.lang.RuntimeException at TestScalaPythonBridge.scala:72
datafu.spark.TestScalaPythonBridge > SparkDFUtilsBridge FAILED
java.lang.RuntimeException at TestScalaPythonBridge.scala:72
24 tests completed, 2 failed
```
This seems to be the relevant info about one of the failures (they both seem to be the
same issue):
```
datafu.spark.TestScalaPythonBridge > pyfromscala.py FAILED
java.lang.RuntimeException: python bridge error:
Python 3.6.0 (default, Jun 6 2018, 13:47:36)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> >>> ... ... ... ... ... ... Traceback (most recent call last):
File "<stdin>", line 2, in <module>
NameError: name 'execfile' is not defined
at org.apache.spark.deploy.SparkPythonRunner.execFile(SparkPythonRunner.scala:135)
at org.apache.spark.deploy.SparkPythonRunner.runPyFile(SparkPythonRunner.scala:50)
at datafu.spark.ScalaPythonBridgeRunner.runPythonFile(ScalaPythonBridge.scala:65)
at datafu.spark.TestScalaPythonBridge$.getNewRunner(TestScalaPythonBridge.scala:41)
at datafu.spark.TestScalaPythonBridge.datafu$spark$TestScalaPythonBridge$$runner$lzycompute(TestScalaPythonBridge.scala:72)
at datafu.spark.TestScalaPythonBridge.datafu$spark$TestScalaPythonBridge$$runner(TestScalaPythonBridge.scala:72)
```
Any idea what's wrong?
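One guess, based only on the traceback above: `execfile` is a Python 2 builtin that was removed in Python 3, and the interpreter that got picked up here is Python 3.6.0. So the bridge may be assuming a Python 2 interpreter. If that is the cause, a small shim along these lines might work. This is only a sketch; `exec_file` is a hypothetical helper name and is not taken from the datafu-spark code.

```python
import os
import tempfile


def exec_file(path, globals_dict=None):
    """Run a Python source file, roughly like Python 2's execfile.

    Hypothetical helper for illustration; not part of datafu-spark.
    """
    if globals_dict is None:
        globals_dict = {"__name__": "__main__"}
    globals_dict.setdefault("__file__", path)
    with open(path) as f:
        source = f.read()
    # Passing the path to compile() keeps tracebacks pointing at the script.
    exec(compile(source, path, "exec"), globals_dict)
    return globals_dict


if __name__ == "__main__":
    # Tiny usage example: write a throwaway script and run it.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write("x = 1 + 2\n")
    try:
        namespace = exec_file(tmp.name)
        print(namespace["x"])
    finally:
        os.remove(tmp.name)
```

Alternatively, pointing the test run at a Python 2 interpreter may be enough to confirm whether this is purely environmental.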