spark-issues mailing list archives

From "Sridhar Rana (JIRA)" <>
Subject [jira] [Created] (SPARK-19920) How to capture reasons (log/trace/info/anything?) for "[ERROR]Driver disassociated! Shutting down."
Date Sat, 11 Mar 2017 15:57:04 GMT
Sridhar Rana created SPARK-19920:

             Summary: How to capture reasons (log/trace/info/anything?) for "[ERROR]Driver disassociated! Shutting down."
                 Key: SPARK-19920
             Project: Spark
          Issue Type: Question
          Components: Spark Submit, YARN
    Affects Versions: 2.1.0
            Reporter: Sridhar Rana
            Priority: Critical

We have an AWS Cloudera Spark environment: a YARN cluster with 1 driver node and 3 executor
nodes. We use Spark SQL heavily and Log4j for logging. Ours is a 24x7 long-running process
in an iterative loop. The process runs fine, but after several iterations (a few hours in)
it reports the error "[ERROR]Driver disassociated! Shutting down.". In
the same second, there is this warning: "[WARN ]Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@24452d3d,BlockManagerId(1,
ip-172-31-21-121.ec2.internal, 40378))] in 1 attempts". The Spark process is able to recover
from this failure, but it takes more time to finish that iteration. Beyond these two messages there
is not much information. How can we determine what is causing this error condition so that
appropriate measures can be taken? Can we capture those details using Log4j?
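One way to get more context is to raise the log level for the Spark components involved in heartbeating and RPC. The sketch below is a hypothetical log4j.properties fragment, not a confirmed fix: the logger (package/class) names are assumptions based on where these messages typically originate in Spark 2.x, and may differ between Spark versions.

```properties
# Hypothetical log4j.properties fragment to surface more detail around
# "Driver disassociated" / heartbeat-send failures. Logger names are
# assumptions and should be checked against your Spark version's sources.

# Executor side: heartbeat sender and the executor backend that logs
# the disassociation message.
log4j.logger.org.apache.spark.executor=DEBUG

# RPC layer, where transient connection drops between driver and
# executors would be reported.
log4j.logger.org.apache.spark.rpc=DEBUG
log4j.logger.org.apache.spark.network=DEBUG

# Driver side: the receiver that times out executor heartbeats.
log4j.logger.org.apache.spark.HeartbeatReceiver=DEBUG
```

If the extra logging points to heartbeats timing out under load (e.g. long GC pauses), tuning `spark.network.timeout` and `spark.executor.heartbeatInterval` in the job configuration is a common mitigation to try, though whether it applies here depends on what the DEBUG output actually shows.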

This message was sent by Atlassian JIRA
