mesos-issues mailing list archives

From "Jack Crawford (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MESOS-8634) master.cpp:7141] Check failed: 'framework' Must be non NULL
Date Sun, 04 Mar 2018 21:03:00 GMT
Jack Crawford created MESOS-8634:
------------------------------------

             Summary: master.cpp:7141] Check failed: 'framework' Must be non NULL
                 Key: MESOS-8634
                 URL: https://issues.apache.org/jira/browse/MESOS-8634
             Project: Mesos
          Issue Type: Choose from below ...
            Reporter: Jack Crawford


While investigating agents that occasionally failed to connect to my master, I found the
crash below. The master aborts on a check that 'framework' must be non-NULL, which is a strange
condition to fail on after the same scheduler had already completed hundreds of tasks.

 

```

...

I0304 19:46:13.884625 2477 master.cpp:9062] Removing executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"}
I0304 19:46:13.884871 2482 hierarchical.cpp:412] Deactivated framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008
I0304 19:46:13.884871 2477 master.cpp:9062] Removing executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"}
I0304 19:46:13.885016 2477 master.cpp:9062] Removing executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"}
I0304 19:46:13.885159 2477 master.cpp:9062] Removing executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"}
I0304 19:46:13.885587 2479 hierarchical.cpp:355] Removed framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008
W0304 19:46:20.119781 2478 master.cpp:6969] Ignoring unknown exited executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' of framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008 on agent 10ef6158-8e97-4f1c-83a
I0304 19:46:21.559152 2481 http.cpp:1185] HTTP GET for /master/state?jsonp=angular.callbacks._ccd from 10.142.0.5:57441 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:59.0) Gecko/2010
W0304 19:46:22.144783 2484 master.cpp:6969] Ignoring unknown exited executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' of framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008 on agent 10ef6158-8e97-4f1c-83a
W0304 19:46:22.184733 2478 master.cpp:6969] Ignoring unknown exited executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' of framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008 on agent 10ef6158-8e97-4f1c-83a
W0304 19:46:22.233067 2477 master.cpp:6969] Ignoring unknown exited executor '3bcbc87a-6a3a-4d93-90e9-5ca8e3be9581' of framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008 on agent 10ef6158-8e97-4f1c-83a
I0304 19:46:31.169176 2481 master.cpp:7073] Marking agent 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-S6 at slave(1)@10.142.0.10:5051 (tf-mesos-agent-zkw9.c.bitcoin-engine.internal) unreachable: health check t
I0304 19:46:31.169500 2481 registrar.cpp:495] Applied 1 operations in 160333ns; attempting to update the registry
I0304 19:46:31.169723 2481 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5321
I0304 19:46:31.169872 2481 replica.cpp:540] Replica received write request for position 5321 from __req_res__(48)@10.142.0.5:5050
I0304 19:46:31.172544 2483 replica.cpp:694] Replica received learned notice for position 5321 from log-network(1)@10.142.0.5:5050
I0304 19:46:31.174551 2478 registrar.cpp:552] Successfully updated the registry in 4.972032ms
I0304 19:46:31.174674 2478 master.cpp:7121] Marked agent 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-S6 at slave(1)@10.142.0.10:5051 (tf-mesos-agent-zkw9.c.bitcoin-engine.internal) unreachable: health check ti
I0304 19:46:31.174679 2484 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 5322
F0304 19:46:31.174721 2478 master.cpp:7141] Check failed: 'framework' Must be non NULL
*** Check failure stack trace: ***
I0304 19:46:31.174834 2484 hierarchical.cpp:626] Removed agent 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-S6
I0304 19:46:31.174934 2480 replica.cpp:540] Replica received write request for position 5322 from __req_res__(49)@10.142.0.5:5050
@ 0x7f6d74aff73d google::LogMessage::Fail()
I0304 19:46:31.176913 2477 replica.cpp:694] Replica received learned notice for position 5322 from log-network(1)@10.142.0.5:5050
@ 0x7f6d74b013bd google::LogMessage::SendToLog()
@ 0x7f6d74aff302 google::LogMessage::Flush()
@ 0x7f6d74b01da9 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f6d73e6cef6 google::CheckNotNull<>()
@ 0x7f6d73e55da3 mesos::internal::master::Master::_markUnreachable()
@ 0x7f6d74a61e22 process::ProcessManager::resume()
@ 0x7f6d74a67d46 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
@ 0x7f6d72a7d970 (unknown)
@ 0x7f6d7229b064 start_thread
@ 0x7f6d71fd062d (unknown)
mesos-master.service: main process exited, code=killed, status=6/ABRT
Unit mesos-master.service entered failed state.

```
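
For anyone triaging a similar log, the sketch below may help line up the events that precede the fatal check. It is a rough helper, not part of Mesos: it assumes the standard glog line prefix (severity letter, MMDD, time, thread id, `file:line]`), and the names `GLOG_RE`, `parse_glog`, and `events_before_fatal` are made up for illustration. The sample lines are copied from the log above, with the second one cut short.

```python
import re

# Assumed glog prefix: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog(lines):
    """Yield one dict per line that starts with a glog prefix; skip the rest
    (stack frames, systemd messages, wrapped continuations)."""
    for raw in lines:
        match = GLOG_RE.match(raw.strip())
        if match:
            yield match.groupdict()


def events_before_fatal(lines, needle):
    """Return (fatal, related): the first F-severity entry, plus the earlier
    entries whose message contains `needle`."""
    related = []
    for entry in parse_glog(lines):
        if entry["sev"] == "F":
            return entry, related
        if needle in entry["msg"]:
            related.append(entry)
    return None, related


# Sample lines taken from the log above (second line abbreviated).
sample = [
    "I0304 19:46:13.885587 2479 hierarchical.cpp:355] Removed framework 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-0008",
    "I0304 19:46:31.174674 2478 master.cpp:7121] Marked agent 10ef6158-8e97-4f1c-83a4-8fd6c6d7b582-S6",
    "F0304 19:46:31.174721 2478 master.cpp:7141] Check failed: 'framework' Must be non NULL",
]

fatal, related = events_before_fatal(sample, "framework")
for entry in related:
    print(entry["time"], entry["src"], entry["msg"])
print("FATAL at", fatal["src"], fatal["line"])
```

On the sample, this surfaces the "Removed framework" entry from hierarchical.cpp ahead of the fatal check in master.cpp. That ordering is only what the log shows; it does not by itself establish the cause of the crash.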



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
