spark-reviews mailing list archives

From HyukjinKwon <...@git.apache.org>
Subject [GitHub] spark issue #18647: [SPARK-21789][PYTHON] Remove obsolete codes for parsing ...
Date Fri, 01 Sep 2017 04:07:23 GMT
GitHub user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18647
  
    I double-checked that **`_split_schema_abstract`**, **`_parse_field_abstract`**, **`_parse_schema_abstract`** and **`_infer_schema_type`** are not used in any public API.
    
    Under `./python/pyspark`:
    
    **1. `_split_schema_abstract`**:
    
    ```
    $ grep -r "_split_schema_abstract" .
    ```
    
    shows
    
    ```
    ./sql/types.py:def _split_schema_abstract(s):
    ./sql/types.py:    >>> _split_schema_abstract("a b  c")
    ./sql/types.py:    >>> _split_schema_abstract("a(a b)")
    ./sql/types.py:    >>> _split_schema_abstract("a b[] c{a b}")
    ./sql/types.py:    >>> _split_schema_abstract(" ")
    ./sql/types.py:    parts = _split_schema_abstract(s)
    ```
    
    Non doctests / tests:
    
    ```
    ./sql/types.py:    parts = _split_schema_abstract(s)
    ```
    
    This is within **3. `_parse_schema_abstract`**:
    
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1274
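
The narrowing step used for each name in this comment (drop doctest lines, the `def` line itself, and matches from the test suite) can be sketched in Python. This is only an illustration of the filtering, not something from the PR: the helper name `usages_outside_doctests_and_tests` is hypothetical, and it assumes each input line has the `path:matched text` shape that `grep -r` prints. It does not handle doctest continuation lines (`...`).

```python
import re


def usages_outside_doctests_and_tests(grep_lines, name):
    """Keep grep matches that are not doctests, test files, or the definition of `name`."""
    definition = re.compile(r"def\s+" + re.escape(name) + r"\s*\(")
    kept = []
    for line in grep_lines:
        # `grep -r` prints "path:matched text"; split on the first colon only.
        path, _, text = line.partition(":")
        stripped = text.strip()
        if stripped.startswith(">>>"):
            continue  # doctest prompt
        if definition.match(stripped):
            continue  # the function's own `def` line
        if path.endswith("tests.py"):
            continue  # usage from the test suite
        kept.append(line)
    return kept


# Sample input copied from the grep output quoted above.
matches = [
    './sql/types.py:def _split_schema_abstract(s):',
    './sql/types.py:    >>> _split_schema_abstract("a b  c")',
    './sql/types.py:    >>> _split_schema_abstract("a(a b)")',
    './sql/types.py:    >>> _split_schema_abstract("a b[] c{a b}")',
    './sql/types.py:    >>> _split_schema_abstract(" ")',
    './sql/types.py:    parts = _split_schema_abstract(s)',
]
remaining = usages_outside_doctests_and_tests(matches, "_split_schema_abstract")
print(remaining)
```

Only the single `parts = ...` call site should survive the filter, which matches the manual result listed above.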
    
    
    **2. `_parse_field_abstract`**:
    
    ```
    $ grep -r "_parse_field_abstract" .
    ```
    
    shows
    
    ```
    ./sql/types.py:def _parse_field_abstract(s):
    ./sql/types.py:    >>> _parse_field_abstract("a")
    ./sql/types.py:    >>> _parse_field_abstract("b(c d)")
    ./sql/types.py:    >>> _parse_field_abstract("a[]")
    ./sql/types.py:    >>> _parse_field_abstract("a{[]}")
    ./sql/types.py:    fields = [_parse_field_abstract(p) for p in parts]
    ```
    
    Non doctests / tests:
    
    ```
    ./sql/types.py:    fields = [_parse_field_abstract(p) for p in parts]
    ```
    
    This is within **3. `_parse_schema_abstract`**:
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1275
    
    
    **3. `_parse_schema_abstract`**:
    
    ```
    $ grep -r "_parse_schema_abstract" .
    ```
    
    shows
    
    ```
    ./sql/tests.py:        from pyspark.sql.types import _parse_schema_abstract, _infer_schema_type
    ./sql/tests.py:        schema = _parse_schema_abstract(abstract)
    ./sql/types.py:        return StructField(name, _parse_schema_abstract(s[idx:]), True)
    ./sql/types.py:def _parse_schema_abstract(s):
    ./sql/types.py:    >>> _parse_schema_abstract("a b  c")
    ./sql/types.py:    >>> _parse_schema_abstract("a[b c] b{}")
    ./sql/types.py:    >>> _parse_schema_abstract("c{} d{a b}")
    ./sql/types.py:    >>> _parse_schema_abstract("a b(t)").fields[1]
    ./sql/types.py:        return _parse_schema_abstract(s[1:-1])
    ./sql/types.py:        return ArrayType(_parse_schema_abstract(s[1:-1]), True)
    ./sql/types.py:        return MapType(NullType(), _parse_schema_abstract(s[1:-1]))
    ./sql/types.py:    >>> schema = _parse_schema_abstract("a b c d")
    ./sql/types.py:    >>> schema = _parse_schema_abstract("a[] b{c d}")
    ```
    
    Non doctests / tests:
    
    ```
    ./sql/types.py:        return StructField(name, _parse_schema_abstract(s[idx:]), True)
    ./sql/types.py:        return _parse_schema_abstract(s[1:-1])
    ./sql/types.py:        return ArrayType(_parse_schema_abstract(s[1:-1]), True)
    ./sql/types.py:        return MapType(NullType(), _parse_schema_abstract(s[1:-1]))
    ```
    
    These four are within **2. `_parse_field_abstract`** (the first) and **3. `_parse_schema_abstract`** (the remaining three):
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1243
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1266
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1269
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1272
    
    
    **4. `_infer_schema_type`**:
    
    ```
    $ grep -r "_infer_schema_type" .
    ```
    
    shows
    
    ```
    ./sql/tests.py:        from pyspark.sql.types import _parse_schema_abstract, _infer_schema_type
    ./sql/tests.py:        typedSchema = _infer_schema_type(rdd.first(), schema)
    ./sql/types.py:def _infer_schema_type(obj, dataType):
    ./sql/types.py:    >>> _infer_schema_type(row, schema)
    ./sql/types.py:    >>> _infer_schema_type(row, schema)
    ./sql/types.py:        eType = _infer_schema_type(obj[0], dataType.elementType)
    ./sql/types.py:        return MapType(_infer_schema_type(k, dataType.keyType),
    ./sql/types.py:                       _infer_schema_type(v, dataType.valueType))
    ./sql/types.py:        fields = [StructField(f.name, _infer_schema_type(o, f.dataType), True)
    ```
    
    Non doctests / tests:
    
    ```
    ./sql/types.py:        eType = _infer_schema_type(obj[0], dataType.elementType)
    ./sql/types.py:        return MapType(_infer_schema_type(k, dataType.keyType),
    ./sql/types.py:                       _infer_schema_type(v, dataType.valueType))
    ./sql/types.py:        fields = [StructField(f.name, _infer_schema_type(o, f.dataType), True)
    ```
    
    These four are within **4. `_infer_schema_type`**:
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1299
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1304
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1305
    
    https://github.com/apache/spark/blob/b56f79cc359d093d757af83171175cfd933162d1/python/pyspark/sql/types.py#L1311
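
The same conclusion (the helpers are referenced only from one another) could also be cross-checked mechanically instead of by reading grep output. The following is a minimal sketch using Python's standard `ast` module. It runs on a small made-up module rather than the real `types.py`: `callers_of`, `TOY_SOURCE`, and the toy function names are hypothetical. It inspects only top-level functions, so references from class bodies or module-level statements would need extra handling. Doctests are ignored automatically because they live inside string literals.

```python
import ast


def callers_of(source, names):
    """Map each name in `names` to the top-level functions whose bodies reference it."""
    tree = ast.parse(source)
    callers = {name: set() for name in names}
    for node in tree.body:
        if not isinstance(node, ast.FunctionDef):
            continue
        # A function's own name is an attribute of FunctionDef, not an ast.Name,
        # so a definition is never counted as a reference to itself.
        for child in ast.walk(node):
            if isinstance(child, ast.Name) and child.id in callers:
                callers[child.id].add(node.name)
    return callers


# Toy stand-in for a module with private, mutually recursive helpers.
TOY_SOURCE = '''
def _split(s):
    """
    >>> _split("a b")
    """
    return s.split()

def _parse_field(s):
    return _parse_schema(s[1:])

def _parse_schema(s):
    parts = _split(s)
    return [_parse_field(p) for p in parts]

def public_api(s):
    return s.upper()
'''

removed = {"_split", "_parse_field", "_parse_schema"}
callers = callers_of(TOY_SOURCE, removed)
# Keep only callers that are NOT themselves scheduled for removal.
external = {name: used_by - removed for name, used_by in callers.items() if used_by - removed}
print(external)
```

An empty `external` mapping means no function outside the removed set references any of the helpers, which is the property the grep results above establish by hand.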
    



