spark-dev mailing list archives

From Michael Nazario <mnaza...@palantir.com>
Subject RE: Spark 1.4.0 pyspark and pylint breaking
Date Wed, 27 May 2015 20:41:54 GMT
I've done some investigation into what work was needed to keep the _types
module named types. This isn't a relative/absolute import problem, but a
problem with the way the tests were run.

I've filed a JIRA ticket here: https://issues.apache.org/jira/browse/SPARK-7899

I also have a pull request for fixing this here: https://github.com/apache/spark/pull/6439

Michael
________________________________________
From: Davies Liu [davies@databricks.com]
Sent: Tuesday, May 26, 2015 4:18 PM
To: Punyashloka Biswal
Cc: Justin Uang; dev@spark.apache.org
Subject: Re: Spark 1.4.0 pyspark and pylint breaking

I don't think relative imports can help in this case.

When you run scripts in pyspark/sql directly, Python doesn't know anything
about the pyspark.sql package; it just sees types.py as a separate
top-level module.
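A sketch of this point, using a throwaway script rather than Spark's tree: a file executed directly has no parent package, so a relative import cannot even run:

```python
# Hedged sketch: why a relative import can't rescue a directly-run script.
# A file executed as a script is the top-level __main__ module with no
# parent package, so 'from . import anything' fails at import time.
import os
import subprocess
import sys
import tempfile

tmp = tempfile.mkdtemp()
script = os.path.join(tmp, "standalone.py")
with open(script, "w") as f:
    f.write("from . import types\n")  # relative import in a plain script

proc = subprocess.run([sys.executable, script],
                      capture_output=True, text=True)
print("relative import" in proc.stderr)  # -> True (ImportError traceback)
```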

On Tue, May 26, 2015 at 12:44 PM, Punyashloka Biswal
<punya.biswal@gmail.com> wrote:
> Davies: Can we use relative imports (from . import types) in the unit tests
> in order to disambiguate between the global and local module?
>
> Punya
>
> On Tue, May 26, 2015 at 3:09 PM Justin Uang <justin.uang@gmail.com> wrote:
>>
>> Thanks for clarifying! I don't understand Python package and module names
>> that well, but I thought that the package namespacing would've helped, since
>> you are in pyspark.sql.types. I guess not?
>>
>> On Tue, May 26, 2015 at 3:03 PM Davies Liu <davies@databricks.com> wrote:
>>>
>>> There is a module called 'types' in python 3:
>>>
>>> davies@localhost:~/work/spark$ python3
>>> Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
>>> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> import types
>>> >>> types
>>> <module 'types' from
>>>
>>> '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/types.py'>
>>>
>>> Without renaming, our `types.py` will conflict with it when you run
>>> unittests in pyspark/sql/ .
>>>
>>> On Tue, May 26, 2015 at 11:57 AM, Justin Uang <justin.uang@gmail.com>
>>> wrote:
>>> > In commit 04e44b37, the migration to Python 3, pyspark/sql/types.py was
>>> > renamed to pyspark/sql/_types.py and then some magic in
>>> > pyspark/sql/__init__.py dynamically renamed the module back to types. I
>>> > imagine that this is some naming conflict with Python 3, but what was
>>> > the
>>> > error that showed up?
>>> >
>>> > The reason why I'm asking about this is because it's messing with
>>> > pylint,
>>> > since pylint cannot now statically find the module. I tried also
>>> > importing
>>> > the package so that __init__ would be run in a init-hook, but that
>>> > isn't
>>> > what the discovery mechanism is using. I imagine it's probably just
>>> > crawling
>>> > the directory structure.
>>> >
>>> > One way to work around this would be something akin to this
>>> >
>>> > (http://stackoverflow.com/questions/9602811/how-to-tell-pylint-to-ignore-certain-imports),
>>> > where I would have to create a fake module, but I would probably be
>>> > missing
>>> > a ton of pylint features on users of that module, and it's pretty
>>> > hacky.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org



