mesos-user mailing list archives

From Joris Van Remoortere <jo...@mesosphere.io>
Subject Re: Failed to shutdown socket with fd xxx
Date Fri, 17 Jun 2016 13:31:49 GMT
The shutdown errors are not the issue.
The concerning part is this warning:

> W0615 15:01:43.285518  4182 linux_launcher.cpp:197] Couldn't find pid
> '42322' in 'mesos_executors.slice'. This can lead to lack of proper
> resource isolation

That suggests you are upgrading from a Mesos version without systemd support
to one that has it, so the executors being recovered were launched outside
'mesos_executors.slice'.
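
To confirm this on an affected agent, one option is to look at
/proc/<pid>/cgroup for each pid named in the warning. Below is a minimal
Python sketch, assuming a Linux host and assuming the slice name shows up as
a path component in at least one cgroup hierarchy. The helper names
(in_slice, pid_in_slice) are mine, not part of Mesos, and this is not
necessarily the exact check that linux_launcher.cpp performs.

```python
from pathlib import Path

SLICE = "mesos_executors.slice"


def in_slice(cgroup_text: str, slice_name: str = SLICE) -> bool:
    """Return True if any cgroup path in the given /proc/<pid>/cgroup
    content has slice_name as one of its path components."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) == 3 and slice_name in parts[2].split("/"):
            return True
    return False


def pid_in_slice(pid: int, slice_name: str = SLICE) -> bool:
    """Read /proc/<pid>/cgroup and check it; False if it cannot be read."""
    try:
        return in_slice(Path(f"/proc/{pid}/cgroup").read_text(), slice_name)
    except OSError:
        # The process has exited, or /proc is not available or readable.
        return False


if __name__ == "__main__":
    # Pids taken from the warnings in the original report.
    for pid in (42322, 42312, 42309, 42304, 42300, 42317):
        print(pid, pid_in_slice(pid))
```

If these all print False while the executors are still running, that would be
consistent with them having been launched before the agent had systemd
support.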

—
*Joris Van Remoortere*
Mesosphere

On Fri, Jun 17, 2016 at 2:35 PM, haosdent <haosdent@gmail.com> wrote:

> Hi, @Qiang.
>
> @Joseph has a nice explanation of the "Shutdown failed on fd" messages:
>
> http://search-hadoop.com/m/0Vlr6pe7qb2MJX8B1&subj=Re+Benign+Shutdown+failed+on+fd+error+messages
> Those errors can be ignored.
>
> For
> ```
> I0615 15:01:43.324935  4172 mem.cpp:602] Started listening for OOM events
> for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> ```
>
> These are normal info logs. They appear when the Mesos CgroupMemIsolator
> registers OOM hooks for your containers.
>
> On Fri, Jun 17, 2016 at 8:22 PM, Joris Van Remoortere <joris@mesosphere.io> wrote:
>
> > Can you provide:
> > 1. The version that you are upgrading from.
> > 2. Whether you made any OS / init system changes alongside this upgrade
> > (just to narrow the scope).
> >
> > It is possible that you are upgrading from a version that did not have
> > systemd support to one that does. If so, the upgrade may require
> > restarting the tasks (either by themselves, or just starting a fresh
> > agent). Please check out some of the work in MESOS-3007 to get a better
> > understanding of the issue I am referring to.
> >
> > If you can verify that you are making one of these transitions from a bad
> > world to a good world, then you can devise a plan for your upgrade.
> >
> > Joris
> >
> > —
> > *Joris Van Remoortere*
> > Mesosphere
> >
> > On Fri, Jun 17, 2016 at 8:28 AM, Qiang Chen <qzschen@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I ran into an issue when upgrading mesos-slave to 0.28.2.
> > >
> > > During the stage where mesos-slave recovers the framework containers,
> > > it produced the following errors.
> > >
> > >
> > > ```
> > > Log file created at: 2016/06/15 15:01:43
> > > Running on machine: mesos-slave-online005-xxx.cloud.xxx.domain
> > > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> > > W0615 15:01:43.285518  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42322' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > W0615 15:01:43.286182  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42312' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > W0615 15:01:43.286669  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42309' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > W0615 15:01:43.287144  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42304' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > W0615 15:01:43.287636  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42300' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > W0615 15:01:43.288120  4182 linux_launcher.cpp:197] Couldn't find pid
> > > '42317' in 'mesos_executors.slice'. This can lead to lack of proper
> > > resource isolation
> > > E0615 15:01:43.471676  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476007  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476143  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476272  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476483  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476618  4201 process.cpp:1958] Failed to shutdown socket
> > > with fd 24: Transport endpoint is not connected
> > >
> > > ```
> > >
> > > It also appears to cause OOM errors, such as:
> > >
> > > ```
> > > I0615 15:01:43.324935  4172 mem.cpp:602] Started listening for OOM events
> > > for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.325469 4172 mem.cpp:722] Started listening on low memory
> > > pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.326004  4172 mem.cpp:722] Started listening on medium
> > > memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.326539  4172 mem.cpp:722] Started listening on critical
> > > memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > >
> > > ```
> > >
> > > Has anyone run into this? Thanks.
> > >
> > > --
> > > Best Regards,
> > > Chen, Qiang
> > >
> > >
> >
>
>
>
> --
> Best Regards,
> Haosdent Huang
>
