spark-issues mailing list archives

From "Davies Liu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail
Date Tue, 24 Feb 2015 21:46:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davies Liu updated SPARK-5973:
------------------------------
    Description: 
Zipping two RDDs with AutoBatchedSerializer will fail; this bug was introduced by SPARK-4841.

{code}
>> a.zip(b).count()
15/02/24 12:11:56 ERROR PythonRDD: Python worker exited unexpectedly (crashed)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/Users/davies/work/spark/python/pyspark/worker.py", line 101, in main
    process()
  File "/Users/davies/work/spark/python/pyspark/worker.py", line 96, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/Users/davies/work/spark/python/pyspark/rdd.py", line 2249, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "/Users/davies/work/spark/python/pyspark/rdd.py", line 2249, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "/Users/davies/work/spark/python/pyspark/rdd.py", line 270, in func
    return f(iterator)
  File "/Users/davies/work/spark/python/pyspark/rdd.py", line 933, in <lambda>
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
  File "/Users/davies/work/spark/python/pyspark/rdd.py", line 933, in <genexpr>
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
  File "/Users/davies/work/spark/python/pyspark/serializers.py", line 306, in load_stream
    " in pair: (%d, %d)" % (len(keys), len(vals)))
ValueError: Can not deserialize RDD with different number of items in pair: (123, 64)
{code}
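The mismatch arises because the pair deserializer in serializers.py pairs up batches from the two streams one-to-one, and AutoBatchedSerializer adapts its batch size to the size of the elements, so the two RDDs being zipped can batch the same number of elements differently (here 123 vs. 64). A minimal pure-Python sketch of the failing check, independent of Spark (the function names and batch contents are illustrative, not Spark's actual code):

```python
from itertools import chain

def load_stream_pairs(key_batches, val_batches):
    """Mimics the pair deserializer's check: batches are paired
    one-to-one and must have the same length."""
    for keys, vals in zip(key_batches, val_batches):
        if len(keys) != len(vals):
            raise ValueError(
                "Can not deserialize RDD with different number of items"
                " in pair: (%d, %d)" % (len(keys), len(vals)))
        for pair in zip(keys, vals):
            yield pair

# The same 192 elements, batched differently on each side -- as
# AutoBatchedSerializer may do when element sizes differ:
data = list(range(192))
keys = [data[:123], data[123:]]                    # batches of 123 and 69
vals = [data[i:i + 64] for i in range(0, 192, 64)] # batches of 64

try:
    list(load_stream_pairs(keys, vals))
except ValueError as e:
    print(e)  # different number of items in pair: (123, 64)

# One way to make the zip robust is to ignore batch boundaries and
# pair elements directly (a sketch of the idea, not the actual patch):
def zip_flattened(key_batches, val_batches):
    return zip(chain.from_iterable(key_batches),
               chain.from_iterable(val_batches))
```

With `zip_flattened`, all 192 elements pair up correctly regardless of how either side chose to batch them.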

  was:zip two rdd with AutoBatchedSerializer will fail, this bug was introduced by SPARK-4841


> zip two rdd with AutoBatchedSerializer will fail
> ------------------------------------------------
>
>                 Key: SPARK-5973
>                 URL: https://issues.apache.org/jira/browse/SPARK-5973
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.3.0, 1.2.1
>            Reporter: Davies Liu
>            Priority: Blocker
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

