Date: Tue, 24 Mar 2015 13:51:52 +0000 (UTC)
From: "Thomas Graves (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-5162) Python yarn-cluster mode

    [ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377887#comment-14377887 ]

Thomas Graves commented on SPARK-5162:
--------------------------------------

So is there anything left on this jira to do? It looks like SPARK-5479 was filed and SPARK-5173 was merged in.

> Python yarn-cluster mode
> ------------------------
>
>                 Key: SPARK-5162
>                 URL: https://issues.apache.org/jira/browse/SPARK-5162
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark, YARN
>            Reporter: Dana Klassen
>              Labels: cluster, python, yarn
>
> Running pyspark in yarn is currently limited to ‘yarn-client’ mode.
> It would be great to be able to submit python applications to the cluster and (just like java classes) have the resource manager set up an AM on any node in the cluster. Does anyone know the issues blocking this feature? I was snooping around with enabling python apps:
> Removing the logic stopping python and yarn-cluster from SparkSubmit.scala
> ...
>     // The following modes are not supported or applicable
>     (clusterManager, deployMode) match {
>       ...
>       case (_, CLUSTER) if args.isPython =>
>         printErrorAndExit("Cluster deploy mode is currently not supported for python applications.")
>       ...
>     }
> …
> and submitting the application via:
> HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster --num-executors 2 --py-files {{insert location of egg here}} --executor-cores 1 ../tools/canary.py
> Everything looks to run alright: PythonRunner is picked up as main class, resources get set up, yarn client gets launched, but it falls flat on its face:
> 2015-01-08 18:48:03,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: DEBUG: FAILED { {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py, 1420742868009, FILE, null }, Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py changed on src filesystem (expected 1420742868009, was 1420742869284
> and
> 2015-01-08 18:48:03,446 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py(->/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py) transitioned from DOWNLOADING to FAILED
> I tracked this down to the Apache Hadoop code (FSDownload.java line 249) related to container localization of files upon downloading. At this point I thought it would be best to raise the issue here and get input.
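
For anyone reading the log above: the "changed on src filesystem" message suggests that localization compares the timestamp recorded for the resource at submission with the file's modification time at download, and fails when they differ. Below is a minimal Python sketch of that comparison, plus a small parser for the log message. It is an illustration only, not the Hadoop code; `ResourceChangedError`, `check_unchanged` and `parse_changed_message` are hypothetical names made up for this sketch.

```python
import re

# Matches the message from the NodeManager log quoted above. The logged
# message has no closing parenthesis, so the pattern does not require one.
CHANGED_RE = re.compile(
    r"Resource (?P<path>\S+) changed on src filesystem "
    r"\(expected (?P<expected>\d+), was (?P<actual>\d+)"
)


class ResourceChangedError(Exception):
    """Hypothetical error type for this sketch (not a Hadoop class)."""


def check_unchanged(path, expected_ms, actual_ms):
    """Assumed shape of the check: the recorded timestamp must equal the
    file's current modification time, both in epoch milliseconds."""
    if expected_ms != actual_ms:
        raise ResourceChangedError(
            f"Resource {path} changed on src filesystem "
            f"(expected {expected_ms}, was {actual_ms}"
        )


def parse_changed_message(line):
    """Return (path, expected_ms, actual_ms, delta_ms), or None if no match."""
    match = CHANGED_RE.search(line)
    if match is None:
        return None
    expected = int(match.group("expected"))
    actual = int(match.group("actual"))
    return match.group("path"), expected, actual, actual - expected
```

Applied to the first log line above, the parser gives a delta of 1275 ms between the two timestamps, which would be consistent with canary.py being written to the staging directory a second time after its timestamp was recorded, though the log alone does not confirm that.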