spark-user mailing list archives

From Tian Zhang <tzhang...@yahoo.com>
Subject Spark streaming checkpoint against s3
Date Wed, 14 Oct 2015 20:41:58 GMT
Hi, I am trying to set the Spark Streaming checkpoint directory to S3. Here is
basically what I did:

    val checkpointDir = "s3://myBucket/checkpoint"
    val ssc = StreamingContext.getOrCreate(
      checkpointDir,
      () => getStreamingContext(sparkJobName,
                                batchDurationSec,
                                classOf[MyClassKryoRegistrator],
                                checkpointDir),
      getHadoopConfiguration)
  
    import org.apache.hadoop.conf.Configuration

    def getHadoopConfiguration: Configuration = {
      val hadoopConf = new Configuration()
      // Make the bucket the default file system
      hadoopConf.set("fs.defaultFS", "s3://" + myBucket + "/")
      // Credentials for both the s3 and s3n schemes
      hadoopConf.set("fs.s3.awsAccessKeyId", "myAccessKey")
      hadoopConf.set("fs.s3.awsSecretAccessKey", "mySecretKey")
      hadoopConf.set("fs.s3n.awsAccessKeyId", "myAccessKey")
      hadoopConf.set("fs.s3n.awsSecretAccessKey", "mySecretKey")
      hadoopConf
    }

It is working, as I can see that it tries to retrieve the checkpoint from S3.

However, it did more than I intended. I saw the following in the log:
15/10/14 19:58:47 ERROR spark.SparkContext: Jar not found at
file:/media/ephemeral0/oncue/mesos-slave/slaves/20151007-172900-436893194-5050-2984-S9/frameworks/20150825-180042-604730890-5050-4268-0003/executors/tian-act-reg.47368a1a-71f9-11e5-ad61-de5fb3a867da/runs/dfc28a6c-48a0-464b-bdb1-d6dd057acd51/artifacts/rna-spark-streaming.jar

Now SparkContext is trying to look at the following path instead of the local one:

file:/media/ephemeral0/oncue/mesos-slave/slaves/20151007-172900-436893194-5050-2984-S9/frameworks/20150825-180042-604730890-5050-4268-0003/executors/tian-act-reg.47368a1a-71f9-11e5-ad61-de5fb3a867da/runs/dfc28a6c-48a0-464b-bdb1-d6dd057acd51/artifacts/rna-spark-streaming.jar

How do I make SparkContext look at just
/media/ephemeral0/oncue/mesos-slave/slaves/20151007-172900-436893194-5050-2984-S9/frameworks/20150825-180042-604730890-5050-4268-0003/executors/tian-act-reg.47368a1a-71f9-11e5-ad61-de5fb3a867da/runs/dfc28a6c-48a0-464b-bdb1-d6dd057acd51/artifacts/rna-spark-streaming.jar?
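
To make the difference between the two forms concrete, here is a minimal sketch using only `java.net.URI`. The object name `PathSchemeSketch`, the helper `describe`, and the shortened path are all hypothetical stand-ins, not anything from the job above. It only shows that the `file:` form carries an explicit scheme while the bare form has none; how Spark or Hadoop then resolves a scheme-less path depends on configuration such as `fs.defaultFS` and is not covered here.

```scala
import java.net.URI

object PathSchemeSketch {
  // Hypothetical shortened path, standing in for the long Mesos sandbox path.
  val localPath = "/tmp/artifacts/rna-spark-streaming.jar"

  // Split a path string into its (optional) URI scheme and its path component.
  def describe(s: String): (Option[String], String) = {
    val uri = new URI(s)
    (Option(uri.getScheme), uri.getPath)
  }

  def main(args: Array[String]): Unit = {
    // Explicit "file" scheme, as in the log message.
    println(describe("file:" + localPath))
    // No scheme at all: the scheme part comes back as None.
    println(describe(localPath))
  }
}
```

Both calls return the same path component; only the scheme differs.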