spark-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21618) http(s) not accepted in spark-submit jar uri
Date Thu, 03 Aug 2017 09:53:01 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112511#comment-16112511
] 

Steve Loughran commented on SPARK-21618:
----------------------------------------

It may depend on HADOOP-14383; I wouldn't recommend rushing to adopt that, as it breaks Azure &
Spark on the Hadoop 3 alphas: see HADOOP-14598

> http(s) not accepted in spark-submit jar uri
> --------------------------------------------
>
>                 Key: SPARK-21618
>                 URL: https://issues.apache.org/jira/browse/SPARK-21618
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.1.1, 2.2.0
>         Environment: pre-built for hadoop 2.6 and 2.7 on mac and ubuntu 16.04. 
>            Reporter: Ben Mayne
>            Priority: Minor
>              Labels: documentation
>
> The documentation suggests I should be able to use an http(s) URI for a jar in spark-submit, but I haven't been successful: https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
> {noformat}
> benmayne@Benjamins-MacBook-Pro ~ $ spark-submit --deploy-mode client --master local[2] --class class.name.Test https://test.com/path/to/jar.jar
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
> 	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
> 	at org.apache.spark.deploy.SparkSubmit$.downloadFile(SparkSubmit.scala:865)
> 	at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
> 	at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
> 	at scala.Option.map(Option.scala:146)
> 	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:316)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> benmayne@Benjamins-MacBook-Pro ~ $
> {noformat}
> If I replace the path with a valid hdfs path (hdfs:///user/benmayne/valid-jar.jar), it works as expected. I've seen the same behavior across 2.2.0 (hadoop 2.6 & 2.7 on mac and ubuntu) and on 2.1.1 on ubuntu.
> This is the example that I'm trying to replicate from https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management:

> > Spark uses the following URL scheme to allow different strategies for disseminating jars:
> > file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file server, and every executor pulls the file from the driver HTTP server.
> > hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as expected
> {noformat}
> # Run on a Mesos cluster in cluster deploy mode with supervise
> ./bin/spark-submit \
>   --class org.apache.spark.examples.SparkPi \
>   --master mesos://207.184.161.138:7077 \
>   --deploy-mode cluster \
>   --supervise \
>   --executor-memory 20G \
>   --total-executor-cores 100 \
>   http://path/to/examples.jar \
>   1000
> {noformat}
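
The trace in the report shows the failure coming from FileSystem.get when it is handed an https URI, while an hdfs URI works. One possible workaround, sketched here and not taken from the issue itself, is to fetch the jar to local disk first and pass the local path to spark-submit. The helper names, the /tmp download directory, and the assumption that only http and https need this treatment are all hypothetical.

```python
import posixpath
import urllib.request
from urllib.parse import urlparse

# Assumption: these are the schemes that fail with
# "No FileSystem for scheme" in the reported setup.
REMOTE_SCHEMES = {"http", "https"}


def needs_local_copy(jar_uri):
    """Return True if the jar URI uses a scheme we assume is unsupported."""
    return urlparse(jar_uri).scheme.lower() in REMOTE_SCHEMES


def local_path_for(jar_uri, download_dir="/tmp"):
    """Compute where a downloaded copy of the jar would be stored."""
    name = posixpath.basename(urlparse(jar_uri).path)
    return posixpath.join(download_dir, name)


def fetch_jar(jar_uri, download_dir="/tmp"):
    """Download the jar to download_dir and return the local path."""
    target = local_path_for(jar_uri, download_dir)
    urllib.request.urlretrieve(jar_uri, target)
    return target


def build_submit_command(jar_uri, main_class, master="local[2]",
                         download_dir="/tmp"):
    """Build a spark-submit argument list (this does not download anything).

    http(s) jars are replaced by the local path that fetch_jar would
    write to; every other URI is passed through unchanged.
    """
    if needs_local_copy(jar_uri):
        jar = local_path_for(jar_uri, download_dir)
    else:
        jar = jar_uri
    return ["spark-submit", "--deploy-mode", "client",
            "--master", master, "--class", main_class, jar]
```

To use it, call fetch_jar on the https URI first, then run the list returned by build_submit_command (for example with subprocess.run). URIs such as hdfs:///user/benmayne/valid-jar.jar are left untouched, matching the report that the hdfs path already works.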



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

