mesos-dev mailing list archives

From Christopher Ketchum <cketc...@ucsc.edu>
Subject CPU soft lock up on mesos-slave
Date Mon, 31 Aug 2015 22:54:46 GMT
Hi all,

I was running a Mesos cluster on EC2 with c4.8xlarge instances when one
of the status checks failed. We are running Mesos 0.22.1 on Ubuntu 14.04,
with kernel version 3.13.0-55-generic. EC2 gave us this console output[1].
I did some searching and found similar issues reported on lkml[2], though
those logs pointed to a specific task and an older kernel, while ours show
only mesos-slave as the process responsible.

Unfortunately, the instance was terminated, so I'm not sure how much useful
debugging can be done. Is this a known issue? We are also using our own
Python executor. Could an error there have caused this?

[1] http://pastebin.com/NgHi8MnS
[2] https://lkml.org/lkml/2014/9/30/498

Thanks,
Chris
