spark-issues mailing list archives

From "Sven Krasser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5051) python: module pyspark.daemon not found
Date Tue, 27 Jan 2015 18:26:34 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293916#comment-14293916 ]

Sven Krasser commented on SPARK-5051:
-------------------------------------

I assume this is related to this thread: http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CA4195FBAA1107C459F078E1BCFEA3E303C5DA1D7A2@MAILBOX-HYD.capiqcorp.com%3E

It'd be a good idea to update this ticket to indicate you're getting the error only when submitting
from an external Windows driver. As far as the error goes, the executor logs on both {{nj09mhf0730.mhf.mhc}}
and {{nj09mhf0731.mhf.mhc}} may have some more detail.
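
As a data point, one way to rule out a path-propagation problem is to pin the executors' Python environment explicitly from the driver. A minimal sketch, not from the ticket: the master URL below is an assumption, while the PYTHONPATH value is taken from the error output quoted further down.

{code}
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("spark://nj09mhf0730.mhf.mhc:7077")  # assumed standalone master URL
        .setAppName("pyspark-daemon-check")
        # Point the executors at the Spark Python libs installed on the cluster
        # nodes, matching the PYTHONPATH reported in the error below.
        .setExecutorEnv("PYTHONPATH",
                        "/home/npokala/data/spark-install/spark-java-1.6/spark-master/python"))
sc = SparkContext(conf=conf)
{code}

If the error disappears with the executor environment set explicitly, that would support the theory that the path is not propagated correctly from the external Windows driver.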

> python: module pyspark.daemon not found
> ---------------------------------------
>
>                 Key: SPARK-5051
>                 URL: https://issues.apache.org/jira/browse/SPARK-5051
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>            Reporter: naveen kumar
>
> Hi,
> I am using the Spark 1.2 jar. I set up a 2-node Spark cluster on Unix machines.
> Now I am trying to connect to the above-mentioned cluster and execute the following commands:
>
> lines = sc.textFile("hdfs://master/data/spark/SINGLE.TXT")
> lineLengths = lines.map(lambda s: len(s))
> totalLength = lineLengths.reduce(lambda a, b: a + b)
> It is giving the following exception.
> Please help me resolve this issue.
> python: module pyspark.daemon not found
> PYTHONPATH was:
>   /home/npokala/data/spark-install/spark-java-1.6/spark-master/python:/home/npokala/data/spark-install/spark-java-1.6/spark-master/python/lib/py4j-0.8.2.1-src.zip:/home/npokala/data/spark-install/spark-java-1.6/spark-master/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar:/home/npokala/data/spark-install/spark-java-1.6/spark-master/sbin/../python/lib/py4j-0.8.2.1-src.zip:/home/npokala/data/spark-install/spark-java-1.6/spark-master/sbin/../python:
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
>         at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
>         at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
>         at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
>         at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:265)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace:
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
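
For completeness: the commands quoted above assume a SparkContext named {{sc}} already exists, as in the PySpark shell. A self-contained sketch of the reproduction might look like the following; the master URL is an assumption, while the HDFS path is taken from the report.

{code}
from pyspark import SparkContext

# Assumed standalone master URL; in the PySpark shell, sc is created for you.
sc = SparkContext("spark://master:7077", "single-txt-length")

lines = sc.textFile("hdfs://master/data/spark/SINGLE.TXT")
lineLengths = lines.map(lambda s: len(s))
totalLength = lineLengths.reduce(lambda a, b: a + b)
print(totalLength)

sc.stop()
{code}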



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

