mesos-dev mailing list archives

From Benjamin Mahler <benjamin.mah...@gmail.com>
Subject Re: Slaves deactivating
Date Wed, 05 Jun 2013 17:46:52 GMT
Hey Brenden, can you provide more of the slave log if you still have it?
It's likely that something was causing the slave to hang, so it would be
useful to see what happened prior to the first log line you posted. We're
investigating an issue at Twitter where slaves can hang for anywhere from 30
minutes to a few hours, likely related to the cgroups freezer.
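
One way to spot a hang like this in a slave log is to look for long silences between consecutive log lines. Below is a minimal Python sketch, not part of Mesos, that parses the glog-style timestamps and reports gaps longer than a threshold. The names `parse_timestamp` and `find_gaps` are hypothetical, the year is an assumption (glog lines carry only month and day), and the sample lines are abbreviated from the slave log quoted in this thread.

```python
import re
from datetime import datetime, timedelta

# glog prefix: severity letter, MMDD, HH:MM:SS.uuuuuu, thread id, file:line]
GLOG_LINE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+:\d+\]"
)


def parse_timestamp(line, year=2013):
    """Return the timestamp of a glog line, or None if it is not one."""
    m = GLOG_LINE.match(line)
    if m is None:
        return None
    # The year is not in the log line; it is supplied by the caller.
    return datetime.strptime(
        f"{year}{m.group(1)} {m.group(2)}", "%Y%m%d %H:%M:%S.%f"
    )


def find_gaps(lines, threshold=timedelta(seconds=75)):
    """Return (previous, current, delta) for each silence above threshold."""
    gaps = []
    previous = None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip stack frames, wrapped fragments, blank lines
        if previous is not None and ts - previous > threshold:
            gaps.append((previous, ts, ts - previous))
        previous = ts
    return gaps


# Abbreviated lines from the slave log quoted in this thread.
SAMPLE = [
    "I0530 17:31:31.329982 24798 slave.cpp:2498] Current usage 1.44%.",
    "I0530 17:32:31.333297 24810 slave.cpp:2498] Current usage 1.55%.",
    "I0530 17:34:12.236701 24790 slave.cpp:2498] Current usage 1.59%.",
    "I0530 17:37:10.797281 24790 slave.cpp:492] Slave asked to shut down",
]

if __name__ == "__main__":
    for start, end, delta in find_gaps(SAMPLE):
        print(f"{start.time()} -> {end.time()}: {delta.total_seconds():.1f}s")
```

The 75s default threshold simply mirrors the health-check window mentioned further down the thread; any value can be passed instead.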


On Thu, May 30, 2013 at 11:50 AM, Brenden Matthews <
brenden.matthews@airbedandbreakfast.com> wrote:

> I agree with you that slaves which fail health checks should be removed.  I
> suspect this is just a matter of tuning, and perhaps an issue related to
> EC2.  I'll try increasing the value and see if that helps for now.
>
> I also found a misconfigured filesystem, so perhaps Mesos is not the culprit
> here :)
>
> On Thu, May 30, 2013 at 11:30 AM, Vinod Kone <vinodkone@gmail.com> wrote:
>
> > Hmm. I'm not sure I agree. If a slave is not responding to health checks,
> > that seems bad to me. A framework would be better off if the slave is
> > shut down, so that it can launch its tasks elsewhere in the cluster.
> >
> > The current parameters (SLAVE_PING_TIMEOUT, MAX_SLAVE_PING_TIMEOUTS) are
> > such that a slave not responding to health checks for 75s is shut down by
> > the master. That seems reasonable to me? If you want that to be tunable,
> > however, we can expose them via master flags.
> >
> > Having said that, the underlying problem that we need to diagnose/fix is
> > to ensure the slave is responsive.
> >
> >
> > On Thu, May 30, 2013 at 11:01 AM, Brenden Matthews <
> > brenden.matthews@airbedandbreakfast.com> wrote:
> >
> > > The slave is running a Hadoop task and is probably under heavy load.  I
> > > think it's normal for it to occasionally respond slowly to health checks,
> > > and Mesos shouldn't be trying to kill it because of this.  I'm not too
> > > concerned about the kill failing; I'm more concerned with the fact that
> > > the process is being erroneously killed in the first place.
> > >
> > >
> > > On Thu, May 30, 2013 at 10:55 AM, Vinod Kone <vinodkone@gmail.com> wrote:
> > >
> > > > Sounds like the slave is not responding to health checks from the
> > > > master. Does this happen right after you start the slave or after a
> > > > while? Are you able to get a system load graph during this time?
> > > >
> > > > Also, the check failure while cleaning cgroups is clearly a bug (likely
> > > > related to MESOS-461 <https://issues.apache.org/jira/browse/MESOS-461>).
> > > >
> > > >
> > > > On Thu, May 30, 2013 at 10:43 AM, Brenden Matthews <
> > > > brenden.matthews@airbedandbreakfast.com> wrote:
> > > >
> > > > > Hey guys,
> > > > >
> > > > > I'm having a frequent problem right now in master.  Slaves keep
> > > > > deactivating and I'm unsure why.  Here's the master log:
> > > > >
> > > > > > W0530 17:35:32.300029 21798 master.cpp:1199] Removing slave 201305300057-1471680778-5050-21299-144 at slave(1)@10.148.178.186:5051 because it has been deactivated
> > > > > > I0530 17:35:32.300742 21800 hierarchical_allocator_process.hpp:423] Removed slave 201305300057-1471680778-5050-21299-144
> > > > > > I0530 17:35:32.302295 21798 master.hpp:295] Removing task Task_Tracker_475 with resources cpus=23.25; mem=51150; disk=126976; ports=[31001-31001, 31999-31999] on slave 201305300057-1471680778-5050-21299-144
> > > > > > I0530 17:35:32.304235 21798 master.hpp:295] Removing task ct:join_search_request_yesterday:1369931394916:2 with resources cpus=1; mem=1; disk=1 on slave 201305300057-1471680778-5050-21299-144
> > > > > > I0530 17:35:32.306157 21798 master.hpp:295] Removing task Task_Tracker_451 with resources cpus=2.25; mem=4950; disk=12288; ports=[31000-31000, 32000-32000] on slave 201305300057-1471680778-5050-21299-144
> > > > >
> > > > >
> > > > > And here's the slave log:
> > > > >
> > > > > > I0530 17:29:31.326658 24787 slave.cpp:2498] Current usage 0.66%. Max allowed age: 6.253510014347754days
> > > > > > I0530 17:30:31.328030 24787 slave.cpp:2498] Current usage 0.67%. Max allowed age: 6.253270522087396days
> > > > > > I0530 17:31:31.329982 24798 slave.cpp:2498] Current usage 1.44%. Max allowed age: 6.199047245757789days
> > > > > > I0530 17:32:31.333297 24810 slave.cpp:2498] Current usage 1.55%. Max allowed age: 6.191838835555544days
> > > > > > I0530 17:34:12.236701 24790 slave.cpp:2498] Current usage 1.59%. Max allowed age: 6.188834669188438days
> > > > > > I0530 17:37:10.797281 24790 slave.cpp:492] Slave asked to shut down by master@10.17.184.87:5050
> > > > > > I0530 17:37:10.938212 24790 slave.cpp:1114] Asked to shut down framework 201305290115-1471680778-5050-30247-0001 by master@10.17.184.87:5050
> > > > > > I0530 17:37:10.954860 24790 slave.cpp:1139] Shutting down framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:10.955477 24790 slave.cpp:2315] Shutting down executor 'executor_Task_Tracker_475' of framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:10.956305 24790 slave.cpp:2315] Shutting down executor 'executor_Task_Tracker_451' of framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:10.956914 24790 slave.cpp:1114] Asked to shut down framework chronos by master@10.17.184.87:5050
> > > > > > I0530 17:37:10.992754 24790 slave.cpp:1139] Shutting down framework chronos
> > > > > > I0530 17:37:11.028842 24790 slave.cpp:2315] Shutting down executor 'ct:join_search_request_yesterday:1369931394916:2' of framework chronos
> > > > > > I0530 17:37:14.847417 24809 cgroups_isolator.cpp:804] Executor ct:join_search_request_yesterday:1369931394916:2 of framework chronos terminated with status 0
> > > > > > I0530 17:37:15.957542 24792 slave.cpp:2384] Killing executor 'executor_Task_Tracker_475' of framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:16.048948 24809 cgroups_isolator.cpp:620] Killing executor ct:join_search_request_yesterday:1369931394916:2 of framework chronos
> > > > > > I0530 17:37:16.085058 24792 slave.cpp:2384] Killing executor 'executor_Task_Tracker_451' of framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:16.086105 24792 slave.cpp:2384] Killing executor 'ct:join_search_request_yesterday:1369931394916:2' of framework chronos
> > > > > > W0530 17:37:16.089593 24790 monitor.cpp:167] Failed to collect resource usage for executor 'ct:join_search_request_yesterday:1369931394916:2' of framework 'chronos': Unknown or killed executor
> > > > > > I0530 17:37:16.089761 24809 cgroups_isolator.cpp:620] Killing executor executor_Task_Tracker_475 of framework 201305290115-1471680778-5050-30247-0001
> > > > > > I0530 17:37:16.093415 24810 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330
> > > > > > I0530 17:37:16.161749 24810 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:16.163240 24809 cgroups_isolator.cpp:1023] OOM notifier is triggered for executor ct:join_search_request_yesterday:1369931394916:2 of framework chronos with uuid 199c6567-1296-427f-a74e-13742b95a330
> > > > > > I0530 17:37:16.163627 24809 cgroups_isolator.cpp:1028] Discarded OOM notifier for executor ct:join_search_request_yesterday:1369931394916:2 of framework chronos with uuid 199c6567-1296-427f-a74e-13742b95a330
> > > > > > I0530 17:37:16.164384 24809 cgroups_isolator.cpp:620] Killing executor executor_Task_Tracker_451 of framework 201305290115-1471680778-5050-30247-0001
> > > > > > E0530 17:37:16.166890 24809 cgroups_isolator.cpp:616] Asked to kill an unknown/killed executor!
> > > > > > I0530 17:37:16.241206 24809 cgroups_isolator.cpp:1023] OOM notifier is triggered for executor executor_Task_Tracker_475 of framework 201305290115-1471680778-5050-30247-0001 with uuid e90a8ce9-812d-4757-833e-62c55ada5cda
> > > > > > I0530 17:37:16.264039 24786 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:16.290067 24809 cgroups_isolator.cpp:1028] Discarded OOM notifier for executor executor_Task_Tracker_475 of framework 201305290115-1471680778-5050-30247-0001 with uuid e90a8ce9-812d-4757-833e-62c55ada5cda
> > > > > > I0530 17:37:16.313125 24809 cgroups_isolator.cpp:1023] OOM notifier is triggered for executor executor_Task_Tracker_451 of framework 201305290115-1471680778-5050-30247-0001 with uuid 10c5125d-cca4-42b7-a11a-a7fda594b005
> > > > > > I0530 17:37:16.321915 24809 cgroups_isolator.cpp:1028] Discarded OOM notifier for executor executor_Task_Tracker_451 of framework 201305290115-1471680778-5050-30247-0001 with uuid 10c5125d-cca4-42b7-a11a-a7fda594b005
> > > > > > F0530 17:37:16.322549 24809 cgroups_isolator.cpp:1165] Failed to destroy cgroup mesos/framework_201305290115-1471680778-5050-30247-0001_executor_executor_Task_Tracker_475_tag_e90a8ce9-812d-4757-833e-62c55ada5cda: Failed to kill tasks in nested cgroups: Collect failed: Failed to send Killed to process 721: No such process
> > > > > > *** Check failure stack trace: ***
> > > > > > I0530 17:37:16.405956 24802 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.258474 24787 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.394127 24781 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.530743 24785 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.670518 24788 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.806540 24803 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:18.942659 24796 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.078434 24807 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.214282 24789 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.350401 24793 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.486559 24800 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.622144 24799 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.758436 24806 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.894443 24792 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:19.996562 24797 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.134127 24783 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.270037 24804 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.406051 24805 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.542073 24808 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.678064 24794 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.814045 24798 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:20.950011 24810 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.052018 24780 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.186023 24786 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.322306 24781 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f57595b5c1d  google::LogMessage::Fail()
> > > > > > I0530 17:37:21.424214 24787 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f57595b83af  google::LogMessage::SendToLog()
> > > > > > I0530 17:37:21.526203 24790 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.662224 24785 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.764526 24788 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.866744 24802 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:21.969965 24780 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.072257 24791 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.210059 24811 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.346051 24782 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f57595b581b  google::LogMessage::Flush()
> > > > > > I0530 17:37:22.482100 24795 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.618103 24784 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.754070 24786 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:22.856056 24781 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f57595b8c3d  google::LogMessageFatal::~LogMessageFatal()
> > > > > > I0530 17:37:22.958155 24790 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.094172 24785 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f575933a42b  mesos::internal::slave::CgroupsIsolator::_killExecutor()
> > > > > > I0530 17:37:23.196523 24801 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.334296 24787 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.436355 24788 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.538429 24796 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.640461 24789 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.742709 24800 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.844823 24803 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:23.946893 24799 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > I0530 17:37:24.050462 24807 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > >     @     0x7f5759355ff9  std::tr1::_Mem_fn<>::operator()()
> > > > > > I0530 17:37:24.152722 24806 cgroups.cpp:1205] Watching cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 until frozen
> > > > > > W0530 17:37:24.154515 24806 cgroups.cpp:1263] Unable to freeze /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330 within 51 attempts
> > > > > > I0530 17:37:24.159581 24794 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330
> > > > > > I0530 17:37:24.160049 24794 cgroups.cpp:1300] Successfully thawed /cgroup/mesos/framework_chronos_executor_ct:join_search_request_yesterday:1369931394916:2_tag_199c6567-1296-427f-a74e-13742b95a330
> > > > > >     @     0x7f5759354a60  _ZNSt3tr15_BindIFNS_7_Mem_fnIMN5mesos8internal5slave15CgroupsIsolatorEFvPNS5_10CgroupInfoERKN7process6FutureIbEEEEENS_12_PlaceholderILi1EEES7_SA_EE6__callIIRPS5_EILi0ELi1ELi2EEEENS_9result_ofIFSF_NSN_IFNS_3_MuISH_Lb0ELb1EEESH_NS_5tupleIIDpT_EEEEE4typeENSN_IFNSO_IS7_Lb0ELb0EEES7_ST_EE4typeENSN_IFNSO_ISA_Lb0ELb0EEESA_ST_EE4typeEEE4typeERKST_NS_12_Index_tupleIIXspT0_EEEE
> > > > > >     @     0x7f57593514d0  _ZNSt3tr15_BindIFNS_7_Mem_fnIMN5mesos8internal5slave15CgroupsIsolatorEFvPNS5_10CgroupInfoERKN7process6FutureIbEEEEENS_12_PlaceholderILi1EEES7_SA_EEclIIPS5_EEENS_9result_ofIFSF_NSM_IFNS_3_MuISH_Lb0ELb1EEESH_NS_5tupleIIDpT_EEEEE4typeENSM_IFNSN_IS7_Lb0ELb0EEES7_SS_EE4typeENSM_IFNSN_ISA_Lb0ELb0EEESA_SS_EE4typeEEE4typeEDpRSQ_
> > > > > >     @     0x7f575934cd38  std::tr1::_Function_handler<>::_M_invoke()
> > > > > >     @     0x7f575934ced9  std::tr1::function<>::operator()()
> > > > > >     @     0x7f5759347f31  process::internal::vdispatcher<>()
> > > > > >     @     0x7f5759354b7d  _ZNSt3tr15_BindIFPFvPN7process11ProcessBaseENS_10shared_ptrINS_8functionIFvPN5mesos8internal5slave15CgroupsIsolatorEEEEEEENS_12_PlaceholderILi1EEESD_EE6__callIIRS3_EILi0ELi1EEEENS_9result_ofIFSF_NSM_IFNS_3_MuISH_Lb0ELb1EEESH_NS_5tupleIIDpT_EEEEE4typeENSM_IFNSN_ISD_Lb0ELb0EEESD_SS_EE4typeEEE4typeERKSS_NS_12_Index_tupleIIXspT0_EEEE
> > > > > >     @     0x7f5759351770  _ZNSt3tr15_BindIFPFvPN7process11ProcessBaseENS_10shared_ptrINS_8functionIFvPN5mesos8internal5slave15CgroupsIsolatorEEEEEEENS_12_PlaceholderILi1EEESD_EEclIIS3_EEENS_9result_ofIFSF_NSL_IFNS_3_MuISH_Lb0ELb1EEESH_NS_5tupleIIDpT_EEEEE4typeENSL_IFNSM_ISD_Lb0ELb0EEESD_SR_EE4typeEEE4typeEDpRSP_
> > > > > >     @     0x7f575934cfc4  std::tr1::_Function_handler<>::_M_invoke()
> > > > >     @     0x7f57594a1a7b  std::tr1::function<>::operator()()
> > > > >     @     0x7f575948a55f  process::ProcessBase::visit()
> > > > >     @     0x7f5759490992  process::DispatchEvent::visit()
> > > > >     @     0x7f57590761d4  process::ProcessBase::serve()
> > > > >     @     0x7f5759487cd1  process::ProcessManager::resume()
> > > > >     @     0x7f575947ee2e  process::schedule()
> > > > >     @     0x7f5757bc6e9a  start_thread
> > > > >     @     0x7f57578f3ccd  (unknown)
> > > > >
> > > > >
> > > > > > Can you provide me some hints as to what's happening here?  This is
> > > > > > currently a major blocker for me!
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Brenden
> > > > >
> > > >
> > >
> >
>
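
The 75s health-check window mentioned in the thread (SLAVE_PING_TIMEOUT, MAX_SLAVE_PING_TIMEOUTS) can be illustrated with a small Python sketch. This is not Mesos code. The values 15s and 5 are assumptions chosen to be consistent with the 75s figure quoted above, and the reset-on-response behaviour and the function name `first_shutdown_ping` are hypothetical simplifications.

```python
# Assumed values: the thread only states the product (75s), not the factors.
SLAVE_PING_TIMEOUT_SECS = 15
MAX_SLAVE_PING_TIMEOUTS = 5


def first_shutdown_ping(pong_received, max_timeouts=MAX_SLAVE_PING_TIMEOUTS):
    """Return the index of the ping at which the slave would be shut down.

    pong_received is a sequence of booleans, one per ping interval, that
    says whether the slave answered in time. The counter of consecutive
    misses resets on any answer (an assumption of this sketch). Returns
    None if the slave is never shut down.
    """
    missed = 0
    for index, answered in enumerate(pong_received):
        missed = 0 if answered else missed + 1
        if missed >= max_timeouts:
            return index
    return None


if __name__ == "__main__":
    window = SLAVE_PING_TIMEOUT_SECS * MAX_SLAVE_PING_TIMEOUTS
    print(f"unresponsive window before shutdown: {window}s")
    # One answer followed by five consecutive misses.
    print(first_shutdown_ping([True] + [False] * 5))
```

Under these assumptions a slave that answers even occasionally is kept, while one that stays silent for five intervals in a row is removed, which is why a heavily loaded but live slave could still be deactivated.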
