spark-dev mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: spark pypy support?
Date Mon, 14 Aug 2017 21:24:25 GMT
As Dong says, yes, we do test with PyPy in our CI environment, but we expect a
"newer" version of PyPy (although I don't think we ever bothered to write
down the exact version requirements for PyPy support, unlike for
regular Python).
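
Since the exact PyPy version matters here, a minimal sketch for reporting
which interpreter and version a script is actually running under may help.
The helper name describe_interpreter is hypothetical (it is not part of
PySpark); it relies only on the standard platform module and on
sys.pypy_version_info, which PyPy defines and CPython does not.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation, language_version, pypy_version_or_None).

    Hypothetical helper for illustration; not a PySpark API.
    """
    impl = platform.python_implementation()   # e.g. "CPython" or "PyPy"
    lang_version = platform.python_version()  # Python language level, e.g. "2.7.13"
    # sys.pypy_version_info only exists when running under PyPy.
    pypy_info = getattr(sys, "pypy_version_info", None)
    pypy_version = None
    if pypy_info is not None:
        pypy_version = ".".join(str(part) for part in pypy_info[:3])
    return impl, lang_version, pypy_version


if __name__ == "__main__":
    impl, lang_version, pypy_version = describe_interpreter()
    print("implementation:", impl)
    print("python version:", lang_version)
    print("pypy version:", pypy_version)
```

Running this with the same executable that the driver and workers use
should show whether the PyPy in question is older than what CI runs.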

On Mon, Aug 14, 2017 at 2:06 PM, Dong Joon Hyun <dhyun@hortonworks.com>
wrote:

> Hi, Tom.
>
>
>
> What version of PyPy do you use?
>
>
>
> In the Jenkins environment, `pypy` always passes like Python 2.7 and
> Python 3.4.
>
>
>
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/3340/consoleFull
>
>
>
> ========================================================================
>
> Running PySpark tests
>
> ========================================================================
>
> Running PySpark tests. Output is in /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.7/python/unit-tests.log
>
> Will test against the following Python executables: ['python2.7',
> 'python3.4', 'pypy']
>
> Will test the following Python modules: ['pyspark-core', 'pyspark-ml',
> 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
>
> Starting test(python2.7): pyspark.mllib.tests
>
> Starting test(pypy): pyspark.sql.tests
>
> Starting test(pypy): pyspark.tests
>
> Starting test(pypy): pyspark.streaming.tests
>
> Finished test(pypy): pyspark.tests (181s)
>
> …
>
>
>
> Tests passed in 1130 seconds
>
>
>
>
>
> Bests,
>
> Dongjoon.
>
>
>
>
>
> *From: *Tom Graves <tgraves_cs@yahoo.com.INVALID>
> *Date: *Monday, August 14, 2017 at 1:55 PM
> *To: *"dev@spark.apache.org" <dev@spark.apache.org>
> *Subject: *spark pypy support?
>
>
>
> Does anyone know if PyPy works with Spark? I saw a JIRA saying it was supported
> back in Spark 1.2, but I'm getting an error when trying it and am not sure if it's
> something with my PyPy version or just something Spark doesn't support.
>
>
>
>
>
> AttributeError: 'builtin-code' object has no attribute 'co_filename'
> Traceback (most recent call last):
>   File "<builtin>/app_main.py", line 75, in run_toplevel
>   File "/homes/tgraves/mbe.py", line 40, in <module>
>     count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 834, in reduce
>     vals = self.mapPartitions(func).collect()
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 808, in collect
>     port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2440, in _jrdd
>     self._jrdd_deserializer, profiler)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2373, in _wrap_function
>     pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2359, in _prepare_for_python_RDD
>     pickled_command = ser.dumps(command)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/serializers.py", line 460, in dumps
>     return cloudpickle.dumps(obj, 2)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 703, in dumps
>     cp.dump(obj)
>   File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 160, in dump
>
>
>
> Thanks,
>
> Tom
>
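
The error message above mentions a 'builtin-code' object lacking
'co_filename'. As a rough way to probe that on a given interpreter (a
sketch only, not a diagnosis of the failure in the traceback), one could
check whether a callable exposes a code object with that attribute. The
helper name code_has_filename is hypothetical, and which callable actually
triggered the error is an assumption that the truncated traceback does not
confirm.

```python
from operator import add


def code_has_filename(func):
    """Return True if func has a code object that carries co_filename.

    Hypothetical helper for illustration; not part of PySpark or cloudpickle.
    """
    code = getattr(func, "__code__", None)
    return code is not None and hasattr(code, "co_filename")


def plain_python_function(x, y):
    return x + y


if __name__ == "__main__":
    # A function defined in Python source should have a co_filename.
    print("plain function:", code_has_filename(plain_python_function))
    # Built-in callables may have no code object at all, or one without
    # co_filename, depending on the interpreter.
    print("operator.add:", code_has_filename(add))
    print("len:", code_has_filename(len))
```

Comparing the output under the PyPy in question and under a PyPy version
that passes CI may help narrow down whether the interpreter version is the
difference.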



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
