spark-issues mailing list archives

From "paul mackles (JIRA)" <>
Subject [jira] [Created] (SPARK-23988) [Mesos] Improve handling of appResource in mesos dispatcher when using Docker
Date Sun, 15 Apr 2018 22:54:00 GMT
paul mackles created SPARK-23988:

             Summary: [Mesos] Improve handling of appResource in mesos dispatcher when using Docker
                 Key: SPARK-23988
             Project: Spark
          Issue Type: Improvement
          Components: Mesos
    Affects Versions: 2.3.0, 2.2.1
            Reporter: paul mackles

Our organization makes heavy use of Docker containers when running Spark on Mesos. The images
we use for our containers include Spark along with all of the application dependencies. We
find this to be a great way to manage our artifacts.

When specifying the primary application jar (i.e. appResource), the mesos dispatcher insists
on adding it to the list of URIs for Mesos to fetch as part of launching the driver's container.
This leads to confusing behavior where paths such as:
 * file:///application.jar
 * local:/application.jar
 * /application.jar

wind up being fetched from the host where the driver is running. Obviously, this doesn't work
since all of the above examples are referencing the path of the jar on the container image.
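
To make the three forms concrete, here is a minimal sketch (in Python, purely for illustration; the dispatcher itself is not written this way) showing that each one names the same path and differs only in its scheme. The helper name scheme_and_path is hypothetical.

```python
from urllib.parse import urlparse

def scheme_and_path(app_resource):
    """Split an appResource string into (scheme, path).

    Hypothetical helper for illustration only. A bare absolute path
    has no scheme, so its scheme comes back as the empty string.
    """
    parsed = urlparse(app_resource)
    return parsed.scheme, parsed.path

# The three forms from the list above all point at /application.jar.
for resource in ("file:///application.jar",
                 "local:/application.jar",
                 "/application.jar"):
    print(resource, "->", scheme_and_path(resource))
```

Since the path is identical in all three cases, the scheme is the only signal available for deciding whether the jar lives on the host or inside the image.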

Here is an example that I used for testing:
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://spark-dispatcher \
  --deploy-mode cluster \
  --conf spark.cores.max=4 \
  --conf spark.mesos.executor.docker.image=spark:2.2.1 \
  local:/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar 10
The "spark:2.2.1" image contains an installation of spark under "/usr/local/spark". Notice
how we reference the appResource using the "local:/" scheme.

If you try the above with the current version of the mesos dispatcher, it will try to fetch the
path "/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar" from the host filesystem
where the driver's container is running. On our systems, this fails since we don't have Spark
installed on the hosts.

For the PR, all I did was modify the mesos dispatcher to not add the "appResource" to the
list of URIs for Mesos to fetch if it uses the "local:/" scheme.
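
The rule described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code from the PR: the function name uris_to_fetch and the extra_uris parameter are hypothetical, and the real dispatcher builds its URI list differently.

```python
from urllib.parse import urlparse

def uris_to_fetch(app_resource, extra_uris=()):
    """Return the URIs Mesos should fetch when launching the driver.

    Hypothetical sketch: the appResource is skipped when it uses the
    "local" scheme, because the jar is already inside the container
    image. Absolute paths and "file:" URIs are still added, which
    keeps the old behavior for those forms.
    """
    uris = list(extra_uris)
    if urlparse(app_resource).scheme != "local":
        uris.append(app_resource)
    return uris

# The jar from the spark-submit example is no longer fetched.
print(uris_to_fetch(
    "local:/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar"))
# A file: URI is still handed to the fetcher, as before.
print(uris_to_fetch("file:///application.jar"))
```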

For now, I didn't change the behavior of absolute paths or the "file:/" scheme because I wanted
to leave some form of the old behavior in place for backwards compatibility. Does anyone have
any opinions on whether these schemes should change as well?

The PR also includes support for using "spark-internal" with Mesos in cluster mode which is
something we need for another use-case. I can separate them if that makes more sense.

