spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mmuru <mmur...@gmail.com>
Subject PySpark structured streaming job throws socket exception
Date Thu, 04 Oct 2018 20:59:39 GMT
Hi,

Running Pyspark structured streaming job on K8S with 2 executor pods. The
driver pod failed with the following up exception. It fails consistently
after 3 to 6hrs of running.  

Any idea how to fix this exception. I really appreciate your help. 


2018-10-04 18:48:27 ERROR DAGScheduler:91 - Failed to update accumulators
for task 21
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.net.SocketInputStream.read(SocketInputStream.java:224)
	at
org.apache.spark.api.python.PythonAccumulatorV2.merge(PythonRDD.scala:659)
	at
org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1257)
	at
org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1249)
	at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at
org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1249)
	at
org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1331)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2100)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2052)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2041)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
2018-10-04 18:48:27 ERROR DAGScheduler:91 - Failed to update accumulators
for task 22
java.net.SocketException: Broken pipe (Write failed)
	at java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
	at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at java.io.DataOutputStream.flush(DataOutputStream.java:123)
	at
org.apache.spark.api.python.PythonAccumulatorV2.merge(PythonRDD.scala:657)
	at
org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1257)
	at
org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1249)
	at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at
org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1249)
	at
org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1331)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2100)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2052)
	at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2041)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Mime
View raw message