spark-issues mailing list archives

From "Annamalai Venugopal (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-22792) PySpark UDF registering issue
Date Fri, 15 Dec 2017 07:16:00 GMT
Annamalai Venugopal created SPARK-22792:
-------------------------------------------

             Summary: PySpark UDF registering issue
                 Key: SPARK-22792
                 URL: https://issues.apache.org/jira/browse/SPARK-22792
             Project: Spark
          Issue Type: Question
          Components: PySpark
    Affects Versions: 2.2.1
         Environment: Windows OS, Python, PyCharm, Spark
            Reporter: Annamalai Venugopal
            Priority: Blocker
             Fix For: 2.2.1


I am working on a project with PySpark and I am stuck on the following issue when applying a UDF:

Traceback (most recent call last):
  File "C:/Users/avenugopal/PycharmProjects/POC_for_vectors/main.py", line 187, in <module>
    hypernym_extracted_data = result.withColumn("hypernym_extracted_data", hypernym_fn(F.column("token_extracted_data")))
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1957, in wrapper
    return udf_obj(*args)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1916, in __call__
    judf = self._judf
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1900, in _judf
    self._judf_placeholder = self._create_judf()
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1909, in _create_judf
    wrapped_func = _wrap_function(sc, self.func, self.returnType)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1866, in _wrap_function
    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\rdd.py", line 2374, in _prepare_for_python_RDD
    pickled_command = ser.dumps(command)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\serializers.py", line 460, in dumps
    return cloudpickle.dumps(obj, 2)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 704, in dumps
    cp.dump(obj)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 148, in dump
    return Pickler.dump(self, obj)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 409, in dump
    self.save(obj)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 736, in save_tuple
    save(element)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
    self.save_function_tuple(obj)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
    save(f_globals)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
    save(v)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
    self.save_function_tuple(obj)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
    save(f_globals)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
    save(v)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 565, in save_reduce
    "args[0] from __newobj__ args has the wrong class")
_pickle.PicklingError: args[0] from __newobj__ args has the wrong class
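
The frames above show the pickler walking the UDF's globals (save_function_tuple, then save(f_globals)), following another function's globals, and failing on one value in save_reduce. The following is a minimal sketch for narrowing down which module-level name a function references that fails to pickle. It uses only the standard library and is not a diagnosis of the code in this report. The names `Real`, `Impostor`, `lookup`, `offset`, `hypernym` and `unpicklable_globals` are hypothetical and introduced only for illustration. `Impostor` is an assumed stand-in that triggers the same consistency check by naming a class other than its own in its reduce value. Note that plain `pickle` is stricter than cloudpickle for functions and lambdas, so a reported name is only a candidate to inspect.

```python
import copyreg
import pickle


class Real:
    """Stand-in for the class an object claims to be when pickled."""


class Impostor:
    """Hypothetical object whose reduce value names a different class."""

    def __reduce_ex__(self, protocol):
        # __newobj__ asks the pickler to rebuild the object via Real.__new__(Real),
        # but the object being saved is an Impostor, so the class check fails.
        return (copyreg.__newobj__, (Real,), {})


def unpicklable_globals(func, protocol=2):
    """Return {name: error} for globals that func references and that fail to pickle."""
    failures = {}
    for name in func.__code__.co_names:
        if name not in func.__globals__:
            continue  # attribute or builtin name, not a module-level global
        try:
            pickle.dumps(func.__globals__[name], protocol)
        except Exception as exc:
            failures[name] = exc
    return failures


lookup = Impostor()  # hypothetical module-level resource used by the function
offset = 1           # an ordinary global that pickles fine


def hypernym(token):
    # References two globals: `lookup` and `offset`.
    return (lookup, token, offset)


failures = unpicklable_globals(hypernym)
for name, exc in failures.items():
    print(name, type(exc).__name__)
```

If a helper like this points at a module-level object, one option to try is creating that object inside the UDF body instead of at module level, so it is not captured when the function is serialized. Whether that applies here depends on code that is not shown in this report.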




