spark-dev mailing list archives

From Mark Baker <dist...@acm.org>
Subject Problems with Pyspark + Dill tests
Date Thu, 19 Jun 2014 14:47:45 GMT
Hi. As part of my attempt to port Pyspark to Python 3, I've
re-applied, with modifications, Josh's old commit for using Dill with
Pyspark (as Dill already supports Python 3). Alas, I ran into an odd
problem that I could use some help with.

Josh's old commit:

https://github.com/JoshRosen/incubator-spark/commit/2ac8986f3009f0dc133b11d16887fc8ddb33c3d1

My Dill branch:

https://github.com/distobj/spark/tree/dill

(Note: I've been running this in a virtualenv into which I
pip-installed dill. I haven't yet figured out the new way to package
it in python/lib, as was done for py4j.)

The problem is that run_tests is failing with this pickle.py error
on most of the tests (those using .cache(), it seems, unsurprisingly):

    PicklingError: Can't pickle <type '_sre.SRE_Pattern'>: it's not
found as _sre.SRE_Pattern

What's odd is that the same doctests work fine when run from the shell.
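For anyone wanting to poke at this locally, here's a minimal sketch of the class of object involved. Compiled regex objects are instances of an internal type that isn't importable by name, which is what "not found as _sre.SRE_Pattern" is complaining about when a serializer tries to pickle the type object itself. Note, though, that the re module registers a copyreg reducer, so round-tripping a pattern *instance* works with the standard pickle module, which suggests the failure is in how the type (rather than an instance) is being handled:

```python
import pickle
import re

# A compiled pattern; its type lives in an internal _sre module and
# cannot be looked up by name, hence the error in the traceback above.
pat = re.compile(r"\d+")

# Thanks to the copyreg reducer registered by the re module, pickling
# the instance itself succeeds and round-trips cleanly:
restored = pickle.loads(pickle.dumps(pat))
assert restored.pattern == pat.pattern
assert restored.match("123")
```

If the dill branch is failing here while plain pickle succeeds, that narrows the bug to how the type is dispatched rather than to compiled patterns being inherently unpicklable.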

TIA for any ideas...
