mesos-issues mailing list archives

From "Guangya Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-1826) Improve logging for when master cannot connect to slaves
Date Mon, 02 Nov 2015 13:26:27 GMT

    [ https://issues.apache.org/jira/browse/MESOS-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985204#comment-14985204 ]

Guangya Liu commented on MESOS-1826:
------------------------------------

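Thanks [~adam-mesos]. After testing with the steps you provided, I think the log messages in the
master are now clear enough that an end user can tell from the master log what is wrong with
the slave: in the output below, the slave is registered and then immediately reported as
disconnected and deactivated. What do you think? Thanks!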

{code}
I1102 21:21:18.333128 26844 replica.cpp:512] Replica received write request for position 1836 from (8)@192.168.0.101:5050
E1102 21:21:18.335482 26851 process.cpp:1911] Failed to shutdown socket with fd 11: Transport endpoint is not connected
I1102 21:21:18.336086 26845 hierarchical.cpp:335] Added slave 0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 (localhost) with cpus(*):1; mem(*):623; disk(*):9618; ports(*):[31000-32000] (allocated: )
I1102 21:21:18.337069 26846 master.cpp:3921] Registered slave 0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 (localhost) with cpus(*):1; mem(*):623; disk(*):9618; ports(*):[31000-32000]
I1102 21:21:18.337333 26846 master.cpp:1077] Slave 0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 (localhost) disconnected
I1102 21:21:18.337376 26846 master.cpp:2525] Disconnecting slave 0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 (localhost)
I1102 21:21:18.337473 26846 master.cpp:2544] Deactivating slave 0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 (localhost)
{code}
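
For anyone who wants to spot this pattern without reading the log by eye, here is a minimal sketch in Python. It is not part of Mesos; the function name `find_quick_disconnects` and the one-second window are assumptions made for illustration. It looks for a "Registered slave" entry followed shortly by a "Slave ... disconnected" entry for the same slave ID, and it expects each log entry on a single line (mail-wrapped entries need to be joined first).

```python
import re

# glog-style prefix as it appears in the excerpt: severity letter, MMDD,
# HH:MM:SS.micros, thread id, "file:line]", then the message text.
PREFIX = re.compile(
    r"^[IWEF](\d{4}) (\d{2}):(\d{2}):(\d{2}\.\d{6})\s+\d+\s+\S+\]\s+(.*)$"
)
REGISTERED = re.compile(r"^Registered slave (\S+)")
DISCONNECTED = re.compile(r"^Slave (\S+) .* disconnected$")


def find_quick_disconnects(lines, window_seconds=1.0):
    """Return (slave_id, elapsed_seconds) pairs where a disconnect follows a
    registration of the same slave within window_seconds on the same day.

    Hypothetical helper for illustration; the window is an arbitrary choice.
    """
    registered_at = {}  # slave id -> (MMDD, seconds since midnight)
    hits = []
    for line in lines:
        match = PREFIX.match(line.strip())
        if match is None:
            continue  # not a log entry (for example a wrapped continuation)
        day, hh, mm, ss, message = match.groups()
        now = int(hh) * 3600 + int(mm) * 60 + float(ss)

        reg = REGISTERED.match(message)
        if reg:
            registered_at[reg.group(1)] = (day, now)
            continue

        disc = DISCONNECTED.match(message)
        if disc and disc.group(1) in registered_at:
            reg_day, reg_time = registered_at.pop(disc.group(1))
            elapsed = now - reg_time
            if reg_day == day and 0 <= elapsed <= window_seconds:
                hits.append((disc.group(1), elapsed))
    return hits


# Three entries taken from the master log above, each joined onto one line.
SAMPLE = [
    "I1102 21:21:18.337069 26846 master.cpp:3921] Registered slave "
    "0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 "
    "(localhost) with cpus(*):1; mem(*):623; disk(*):9618; ports(*):[31000-32000]",
    "I1102 21:21:18.337333 26846 master.cpp:1077] Slave "
    "0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 "
    "(localhost) disconnected",
    "I1102 21:21:18.337376 26846 master.cpp:2525] Disconnecting slave "
    "0ad8ede6-9627-4a95-a5c9-d7a21c1ac4c8-S0 at slave(1)@127.0.0.1:5051 (localhost)",
]

for slave_id, elapsed in find_quick_disconnects(SAMPLE):
    print(f"{slave_id} disconnected {elapsed:.6f}s after registering")
```

On the sample entries this reports the one slave from the log, since its disconnect is logged well under a second after its registration.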

> Improve logging for when master cannot connect to slaves
> --------------------------------------------------------
>
>                 Key: MESOS-1826
>                 URL: https://issues.apache.org/jira/browse/MESOS-1826
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Thomas Rampelberg
>            Assignee: Guangya Liu
>            Priority: Minor
>              Labels: newbie
>
> When first setting a mesos cluster up, it is possible to get into a state where your
> slaves are constantly re-registering. This happens because the slave pid is not reachable
> from the master.
> Currently, the master logs make it pretty tough to figure out that this is the problem
> that is occurring. It would be fantastic if there was a better explanation in the logs,
> something like:
>     Unable to connect to slave X at x.x.x.x:5051. Please make sure that host is reachable
>     from your master.
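
The kind of check behind the message proposed in the description can be sketched in a few lines of Python. This is only an illustration run from the master host, not how the Mesos master (which is C++) implements anything; the function name `check_slave_reachable` and the two-second timeout are assumptions. It attempts a plain TCP connection to the slave's address and, on failure, returns text modeled on the wording suggested in the issue.

```python
import socket


def check_slave_reachable(slave_id, host, port=5051, timeout=2.0):
    """Try a TCP connection to the slave's address.

    Returns None when the connection succeeds, otherwise a message modeled
    on the wording proposed in the issue. Hypothetical helper, not Mesos API.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        return (
            f"Unable to connect to slave {slave_id} at {host}:{port} ({exc}). "
            "Please make sure that host is reachable from your master."
        )
```

Calling `check_slave_reachable("S0", "127.0.0.1")` on the master host returns `None` if something accepts connections on port 5051 there, and the explanatory message otherwise. A successful TCP connect only shows the port is reachable; it does not prove that the address the slave advertises in its pid is the one being tested.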



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)