spark-issues mailing list archives

From "Thomas Graves (JIRA)" <>
Subject [jira] [Commented] (SPARK-5162) Python yarn-cluster mode
Date Tue, 24 Mar 2015 13:51:52 GMT


Thomas Graves commented on SPARK-5162:

So is there anything left to do on this JIRA?  It looks like SPARK-5479 was filed and SPARK-5173
was merged.

> Python yarn-cluster mode
> ------------------------
>                 Key: SPARK-5162
>                 URL:
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark, YARN
>            Reporter: Dana Klassen
>              Labels: cluster, python, yarn
> Running PySpark on YARN is currently limited to ‘yarn-client’ mode. It would be great
to be able to submit Python applications to the cluster and (just like Java classes) have
the resource manager set up an AM on any node in the cluster. Does anyone know what issues are
blocking this feature? I was experimenting with enabling Python apps:
> Removing the logic that blocks Python in yarn-cluster mode from SparkSubmit.scala
> ...
>     // The following modes are not supported or applicable
>     (clusterManager, deployMode) match {
>       ...
>       case (_, CLUSTER) if args.isPython =>
>         printErrorAndExit("Cluster deploy mode is currently not supported for python
>       ...
>     }
> …
> and submitting the application via:
> HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster --num-executors
2 --py-files {{insert location of egg here}} --executor-cores 1 ../tools/
> Everything appears to run fine: PythonRunner is picked up as the main class, resources get
set up, and the YARN client gets launched, but then it falls flat on its face:
> 2015-01-08 18:48:03,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED { {{redacted}}/.sparkStaging/application_1420594669313_4687/, 1420742868009,
FILE, null }, Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/
changed on src filesystem (expected 1420742868009, was 1420742869284
> and
> 2015-01-08 18:48:03,446 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/>/data/4/yarn/nm/usercache/klassen/filecache/11/
transitioned from DOWNLOADING to FAILED
> I tracked this down to the Apache Hadoop code (line 249) related to container
localization of files upon downloading. At this point I thought it would be best to raise the
issue here and get input.
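
As a rough illustration of the failure in the log lines above: the localizer appears to compare the timestamp recorded for a staged resource against the file's current modification time and to fail the download when they differ. The sketch below is a simplified Python model of that check, not the Hadoop code itself; `check_unchanged` and `ResourceChangedError` are hypothetical names, and only the two timestamps are taken from the log.

```python
class ResourceChangedError(IOError):
    """Raised when a staged file's modification time differs from the recorded one."""


def check_unchanged(path, expected_mtime_ms, actual_mtime_ms):
    # Hypothetical helper modelling the localization check: a timestamp is
    # recorded for each resource at submission time, and the download is
    # rejected if the source file's modification time no longer matches it.
    if actual_mtime_ms != expected_mtime_ms:
        raise ResourceChangedError(
            "Resource %s changed on src filesystem (expected %d, was %d"
            % (path, expected_mtime_ms, actual_mtime_ms)
        )
    return True


# Timestamps taken from the log lines above; the path is a placeholder.
try:
    check_unchanged("<staged file>", 1420742868009, 1420742869284)
except ResourceChangedError as err:
    print(err)
```

In the log, the two values differ by 1275 ms, which suggests the staged file was written again after its timestamp had been recorded.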

