spark-reviews mailing list archives

From majdou41 <>
Subject [GitHub] spark issue #17282: [SPARK-19872][PYTHON] Use the correct deserializer for R...
Date Fri, 09 Feb 2018 13:08:56 GMT
Github user majdou41 commented on the issue:
    My code is: sc.binaryFiles('hdfs://localhost:9000/user/majdouline/Training').repartition(90).collect()
    and I got this error: UTF8Deserializer(True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File ".../spark/python/pyspark/", line 811, in collect
        return list(_load_from_socket(port, self._jrdd_deserializer))
      File ".../spark/python/pyspark/", line 549, in load_stream
        yield self.loads(stream)
      File ".../spark/python/pyspark/", line 544, in loads
        return s.decode("utf-8") if self.use_unicode else s
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
    I changed the serializers (from version 2.1.0 to 2.0.2), but I got the same error.

    Can you please help me fix that?
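
For reference, the decode failure at the bottom of the traceback can be reproduced without Spark. The sketch below is only an illustration under an assumption: it supposes the bytes reaching the UTF-8 decoder were pickled data, which is one plausible reading of a leading 0x80 byte, since pickle protocol 2 and later start with that byte. The `payload` value and the `try_utf8` helper are hypothetical and are not taken from the job above.

```python
import pickle

# Hypothetical stand-in for one record; not taken from the original job.
payload = pickle.dumps("some text", protocol=2)

# Pickle protocol 2 and later begin with the PROTO opcode, byte 0x80.
first_byte = payload[:1]


def try_utf8(data):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


# 0x80 is a UTF-8 continuation byte, so it cannot start a character.
result = try_utf8(payload)
print(repr(first_byte))
print(result)
```

If this reading is right, the bytes themselves are fine and the problem is which deserializer is applied to them, so changing the input files would not be expected to help.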

