spark-user mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: How to submit Pyspark job in mesos?
Date Wed, 30 Jul 2014 05:04:34 GMT
On Tue, Jul 29, 2014 at 6:42 PM, daijia <jia_dai@intsig.com> wrote:
> Dear all,
>
>        I have spark1.0.0 and mesos0.18.1. After setting in mesos and spark
> and starting the mesos cluster, I try to run the pyspark job by the command
> below:
>
>        spark-submit /path/to/my_pyspark_job.py  --master
> mesos://192.168.0.21:5050
>
>        It occurs error below:
>
> 14/07/29 18:40:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 14/07/29 18:40:49 INFO server.AbstractConnector: Started
> SelectChannelConnector@0.0.0.0:4041
> 14/07/29 18:40:49 INFO ui.SparkUI: Started SparkUI at http://CentOS-19:4041
> 14/07/29 18:40:49 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 14/07/29 18:40:50 INFO scheduler.EventLoggingListener: Logging events to
> /tmp/spark-events/my_test.py-1406630449771
> 14/07/29 18:40:50 INFO util.Utils: Copying
> /home/daijia/deal_three_word/my_test.py to
> /tmp/spark-4365b01d-b57a-4abb-b39c-cb57b83a28ce/my_test.py
> 14/07/29 18:40:50 INFO spark.SparkContext: Added file
> file:/home/daijia/deal_three_word/my_test.py at
> http://192.168.3.91:51188/files/my_test.py with timestamp 1406630450333
> I0729 18:40:50.440551 15033 sched.cpp:121] Version: 0.18.1
> I0729 18:40:50.442450 15035 sched.cpp:217] New master detected at
> master@192.168.3.91:5050
> I0729 18:40:50.442570 15035 sched.cpp:225] No credentials provided.
> Attempting to register without authentication
> I0729 18:40:50.443234 15036 sched.cpp:391] Framework registered with
> 20140729-174911-1526966464-5050-13758-0006
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Registered as
> framework ID 20140729-174911-1526966464-5050-13758-0006
> 14/07/29 18:40:50 INFO spark.SparkContext: Starting job: count at
> /home/daijia/deal_three_word/my_test.py:27
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 0 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 1 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 3 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos
> slave value: "20140729-163345-1526966464-5050-10913-0"
>  due to too many failures; is Spark installed on it?

The Spark executor cannot start on the Mesos slaves; check the logs on
the slaves for the actual error.

Maybe you forgot to install Spark on the slaves?
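One more thing worth double-checking (an editorial note, not part of the original reply): `spark-submit` treats everything after the application file as arguments *to that application*, so in the command above `--master mesos://192.168.0.21:5050` is never seen by `spark-submit` at all. Options must come before the script. A sketch of a corrected submission for Spark 1.0.x on Mesos, where the slave paths and the tarball URL are placeholders to adapt to your cluster:

```shell
# Settings typically placed in conf/spark-env.sh for Spark 1.0.x on Mesos.
# Both values below are examples, not taken from the original thread:
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
# If Spark is not pre-installed on every slave, point the executors at a
# Spark distribution reachable from all slaves (HDFS or HTTP):
export SPARK_EXECUTOR_URI=hdfs://namenode/path/to/spark-1.0.0-bin.tgz

# Options must precede the application file; anything after my_pyspark_job.py
# is passed to the Python script itself, not parsed by spark-submit.
spark-submit --master mesos://192.168.0.21:5050 /path/to/my_pyspark_job.py
```

With the `--master` flag in the wrong position, the driver falls back to whatever master is configured elsewhere, which is consistent with the repeated "Initial job has not accepted any resources" warnings in the log.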


> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 2 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos
> slave value: "20140729-163345-1526966464-5050-10913-2"
>  due to too many failures; is Spark installed on it?
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Got job 0 (count at
> /home/daijia/deal_three_word/my_test.py:27) with 2 output partitions
> (allowLocal=false)
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at
> /home/daijia/deal_three_word/my_test.py:27)
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Parents of final stage:
> List()
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Missing parents: List()
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 4 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Submitting Stage 0
> (PythonRDD[1] at RDD at PythonRDD.scala:37), which has no missing parents
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 5 is
> now TASK_LOST
> 14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos
> slave value: "20140729-163345-1526966464-5050-10913-1"
>  due to too many failures; is Spark installed on it?
> 14/07/29 18:40:50 INFO scheduler.DAGScheduler: Submitting 2 missing tasks
> from Stage 0 (PythonRDD[1] at RDD at PythonRDD.scala:37)
> 14/07/29 18:40:50 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with
> 2 tasks
> 14/07/29 18:41:05 WARN scheduler.TaskSchedulerImpl: Initial job has not
> accepted any resources; check your cluster UI to ensure that workers are
> registered and have sufficient memory
> 14/07/29 18:41:20 WARN scheduler.TaskSchedulerImpl: Initial job has not
> accepted any resources; check your cluster UI to ensure that workers are
> registered and have sufficient memory
> 14/07/29 18:41:20 WARN scheduler.TaskSchedulerImpl: Initial job has not
> accepted any resources; check your cluster UI to ensure that workers are
> registered and have sufficient memory
>
>      It just repeats the last message.
>      Here is my Python script:
>
> #!/usr/bin/env python
> #coding=utf-8
> from pyspark import SparkContext
> sc = SparkContext()
> sc.parallelize(range(1000)).count()
>
>
>         So, is the running command right? Or is something else causing
> the problem?
>
> Thanks in advance,
> Daijia
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-submit-Pyspark-job-in-mesos-tp10905.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
