From: tgravescs
To: reviews@spark.apache.org
Subject: [GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...
Date: Thu, 7 Jun 2018 18:16:06 +0000 (UTC)

Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r193842887

    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --

    Good questions.

    - Precedence: right now you can work around this issue by exporting PYTHONPATH before launching spark-submit, but that could simply be something lingering in someone's environment on the launcher box and may not be what you want inside a YARN container. I would expect a PYTHONPATH set explicitly via spark.yarn.appMasterEnv to take precedence over that, since it was explicitly configured. The second question, how this interacts with --py-files, is less clear to me, since as you said those are explicitly specified as well. Maybe the order should be --py-files first, then spark.yarn.appMasterEnv.PYTHONPATH, and last the PYTHONPATH from the launcher environment. That differs from the current behavior, though. Thoughts? (A sketch of that merge order is included below.)

    - Agreed, this should not be reflected in the executors, so if it currently is, we shouldn't do that. We should make sure spark.executorEnv.PYTHONPATH works.
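
    A minimal, standalone sketch of the precedence order floated above (--py-files entries, then spark.yarn.appMasterEnv.PYTHONPATH, then the launcher's PYTHONPATH env var). This is not the actual Client.scala change from the diff: the buildPythonPath helper, its parameters, and the plain ":" default separator are illustrative assumptions, whereas the real code builds the value inside Client using ApplicationConstants.CLASS_PATH_SEPARATOR.

        // Sketch only: shows one possible precedence for merging PYTHONPATH sources.
        // Names and the ":" default separator are assumptions for illustration.
        object PythonPathPrecedenceSketch {

          def buildPythonPath(
              pyFilesPath: Seq[String],               // entries derived from --py-files
              appMasterEnvPythonPath: Option[String], // spark.yarn.appMasterEnv.PYTHONPATH
              launcherEnvPythonPath: Option[String],  // PYTHONPATH exported before spark-submit
              separator: String = ":"): String = {
            // Highest precedence first: explicit --py-files entries, then the explicitly
            // configured AM env value, then whatever happened to be in the launcher's env.
            (pyFilesPath ++ appMasterEnvPythonPath.toSeq ++ launcherEnvPythonPath.toSeq)
              .filter(_.nonEmpty)
              .mkString(separator)
          }

          def main(args: Array[String]): Unit = {
            // Example values are hypothetical; only sys.env is read from the real environment.
            val merged = buildPythonPath(
              pyFilesPath = Seq("{{PWD}}/__pyfiles__"),
              appMasterEnvPythonPath = Some("/opt/custom/python/libs"),
              launcherEnvPythonPath = sys.env.get("PYTHONPATH"))
            println(s"PYTHONPATH=$merged")
          }
        }

    Whatever order is settled on, keeping the merge in one place like this would make the chosen precedence easy to see and to change.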