Date: Tue, 2 May 2017 10:50:04 +0000 (UTC)
From: "Hyukjin Kwon (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-19019) PySpark does not work with Python 3.6.0

[ https://issues.apache.org/jira/browse/SPARK-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992699#comment-15992699 ]

Hyukjin Kwon commented on SPARK-19019:
--------------------------------------

To solve this problem fully, I had to port the cloudpickle change too in the PR. Only fixing the hijacked one described above does not fully solve this issue. Please refer to the discussion and the change in the PR.

> PySpark does not work with Python 3.6.0
> ---------------------------------------
>
>                 Key: SPARK-19019
>                 URL: https://issues.apache.org/jira/browse/SPARK-19019
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Critical
>             Fix For: 1.6.4, 2.0.3, 2.1.1, 2.2.0
>
> Currently, PySpark does not work with Python 3.6.0.
> Running {{./bin/pyspark}} simply throws the error as below:
> {code}
> Traceback (most recent call last):
>   File ".../spark/python/pyspark/shell.py", line 30, in <module>
>     import pyspark
>   File ".../spark/python/pyspark/__init__.py", line 46, in <module>
>     from pyspark.context import SparkContext
>   File ".../spark/python/pyspark/context.py", line 36, in <module>
>     from pyspark.java_gateway import launch_gateway
>   File ".../spark/python/pyspark/java_gateway.py", line 31, in <module>
>     from py4j.java_gateway import java_import, JavaGateway, GatewayClient
>   File "<frozen importlib._bootstrap>", line 961, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
>   File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
>   File ".../spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 18, in <module>
>   File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pydoc.py", line 62, in <module>
>     import pkgutil
>   File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pkgutil.py", line 22, in <module>
>     ModuleInfo = namedtuple('ModuleInfo', 'module_finder name ispkg')
>   File ".../spark/python/pyspark/serializers.py", line 394, in namedtuple
>     cls = _old_namedtuple(*args, **kwargs)
> TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
> {code}
> The problem is in https://github.com/apache/spark/blob/3c68944b229aaaeeaee3efcbae3e3be9a2914855/python/pyspark/serializers.py#L386-L394 as the error says, and the cause seems to be that the optional arguments of {{namedtuple}} are completely keyword-only arguments from Python 3.6.0 (see https://bugs.python.org/issue25628).
> We currently copy this function via {{types.FunctionType}}, which does not set the default values of keyword-only arguments (meaning {{namedtuple.__kwdefaults__}}), and this seems to cause missing values internally in the function (non-bound arguments).
> This ends up as below:
> {code}
> import types
> import collections
>
> def _copy_func(f):
>     return types.FunctionType(f.__code__, f.__globals__, f.__name__,
>                               f.__defaults__, f.__closure__)
>
> _old_namedtuple = _copy_func(collections.namedtuple)
> {code}
> If we call as below:
> {code}
> >>> _old_namedtuple("a", "b")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
> {code}
> It throws an exception as above because {{__kwdefaults__}} for the required keyword arguments seems unset in the copied function. So, if we give explicit values for these,
> {code}
> >>> _old_namedtuple("a", "b", verbose=False, rename=False, module=None)
> <class '__main__.a'>
> {code}
> It works fine.
> It seems now we should properly set these into the hijacked one.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org