spark-dev mailing list archives

From Maciej Szymkiewicz <mszymkiew...@gmail.com>
Subject Re: [PYTHON] PySpark typing hints
Date Tue, 23 May 2017 13:45:49 GMT


On 05/23/2017 02:45 PM, Mendelson, Assaf wrote:
>
> You are correct,
>
> I actually did not look too deeply into it until now as I noticed you
> mentioned it is compatible with python 3 only and I saw in the github
> that mypy or pytype is required.
>
>  
>
> Because of that I made my suggestions with the thought of python 2.
>
>  
>
> Looking into it more deeply, I am wondering what is not supported. Are
> you talking about limitations for testing?
>

Since type checkers (unlike annotations) are not standardized, this
varies between projects and versions. For MyPy, quite a lot has changed
since I started annotating Spark.

A few months ago I wouldn't even have bothered looking at the list of
issues. Today (as mentioned in the other message) we could remove
metaclasses and pass both Python 2 and Python 3 checks.

The other part is the typing module itself, as well as function
annotations (outside docstrings). But neither is a problem with stub
files.
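
To illustrate the point about stub files, here is a minimal sketch. The
function `repeat` and the file name `example.pyi` are hypothetical and not
taken from the PySpark annotations. The source stays free of annotation
syntax and of any typing import, so it remains valid Python 2, while the
hints live in a separate stub that only the type checker reads.

```python
# Source module: no annotation syntax and no typing import, so it can be
# imported under Python 2 as well as Python 3.
def repeat(value, times=2):
    return value * times


# Contents of a matching stub file (hypothetical name: example.pyi).
# A checker reads the stub instead of the source; the interpreter never
# imports it.
STUB = '''
def repeat(value: str, times: int = ...) -> str: ...
'''

# The stub is ordinary Python 3 syntax, so it compiles without being run.
compile(STUB, "example.pyi", "exec")

# The source function itself carries no annotations at runtime.
print(repeat("ab"), repeat.__annotations__)
```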
>
>  
>
> If I understand correctly then one can use this without any issues for
> pycharm (and other IDEs supporting the type hinting) even when
> developing for python 2.
>

This depends strictly on the type checker. I didn't follow the
development, but I got the impression that a lot changed, for example,
between PyCharm 2016.3 and 2017.1. I think the important point is that
lack of support doesn't break anything.
>
> In addition, the tests can test the existing pyspark, they just have
> to be run with a compatible packaging (e.g. mypy).
>
> Meaning that porting for python 2 would provide a very small advantage
> over the immediate advantages (IDE usage and testing for most cases).
>
>  
>
> Am I missing something?
>
>  
>
> Thanks,
>
>               Assaf.
>
>  
>
> *From:*Maciej Szymkiewicz [mailto:mszymkiewicz@gmail.com]
> *Sent:* Tuesday, May 23, 2017 3:27 PM
> *To:* Mendelson, Assaf
> *Subject:* Re: [PYTHON] PySpark typing hints
>
> On 05/23/2017 01:12 PM, assaf.mendelson wrote:
>
>     That said, if we make a decision on the way to handle it then I
>     believe it would be a good idea to start even with the bare
>     minimum and continue to add to it (and therefore make it so many
>     people can contribute). The code I added on GitHub was basically
>     the things I needed.
>
> I already have almost full coverage of the API, excluding some exotic
> parts of the legacy streaming, so starting with the bare minimum is not
> really required.
>
> The advantage of the first is that it is part of the code, which means
> it is easier to keep it updated. The main issue with this is that
> supporting auto-generated code (as is the case in most functions) can
> be a little awkward, and it is actually related to a separate issue, as
> it means PyCharm marks most of the functions as an error (i.e.
> pyspark.sql.functions.XXX is marked as not there…)
>
>
> Comment-based annotations are not suitable for complex signatures with
> multiversion support.
>
> Also, there is no support for overloading, so it is not possible to
> capture the relationship between arguments, or between arguments and
> the return type.
>
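
As a rough sketch of what overloading buys here: the hypothetical function
`head` below (not part of PySpark) returns a single item when called
without `n` and a list when `n` is given. With `typing.overload` each call
shape gets its own return type, whereas a single comment-style signature
could only declare one combined return type for both.

```python
from typing import List, Optional, overload


# Each overload ties one argument shape to one return type.
@overload
def head(rows: List[str]) -> Optional[str]: ...
@overload
def head(rows: List[str], n: int) -> List[str]: ...


def head(rows, n=None):
    # Runtime implementation; the overloads above exist for the checker.
    if n is None:
        return rows[0] if rows else None
    return rows[:n]


print(head(["a", "b", "c"]))     # one item
print(head(["a", "b", "c"], 2))  # a list of items
```

In a stub file only the two decorated signatures would appear, without
the implementation.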

-- 
Maciej Szymkiewicz

