mesos-user mailing list archives

From Nan Xiao <xiaonan830...@gmail.com>
Subject Re: The issue of "Failed to shutdown socket with fd xx: Transport endpoint is not connected" on Mesos master
Date Tue, 29 Dec 2015 09:37:03 GMT
BTW, running "lsof" shows only 16 open file descriptors, so I don't
know why the Mesos master tries to close "fd 17".
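
For reference, "Transport endpoint is not connected" is the message for
errno ENOTCONN, which shutdown() returns when the socket is not (or is no
longer) connected. The Python sketch below reproduces only that errno on
Linux with an unconnected TCP socket. It is not Mesos code, and the helper
name shutdown_error is hypothetical. It also shows that once a descriptor
is closed it no longer exists, so a later "lsof" snapshot would not list
it. Whether this is what happened to fd 17 on the master is an assumption,
not something confirmed in this thread.

```python
import errno
import socket


def shutdown_error(sock):
    """Return the errno name raised by shutdown(), or None on success."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


# A TCP socket that was never connected stands in for a socket whose
# peer is already gone (assumption: the master saw a similar state).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = sock.fileno()

result = shutdown_error(sock)
print(f"shutdown on fd {fd}: {result}")

# After close() the descriptor is released and no longer visible to
# tools that list open files.
sock.close()
print(f"descriptor after close: {sock.fileno()}")
```

The point of the sketch is that the error describes the state of the
connection at the time of the shutdown() call, not the number of
descriptors that happen to be open when "lsof" is run afterwards.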
Best Regards
Nan Xiao


On Tue, Dec 29, 2015 at 11:32 AM, Nan Xiao <xiaonan830818@gmail.com> wrote:
> Hi Klaus,
>
> Firstly, thanks very much for your answer!
>
> The km processes are all alive:
> root     129474 128024  2 22:26 pts/0    00:00:00 km apiserver
> --address=15.242.100.60 --etcd-servers=http://15.242.100.60:4001
> --service-cluster-ip-range=10.10.10.0/24 --port=8888
> --cloud-provider=mesos --cloud-config=mesos-cloud.conf --secure-port=0
> --v=1
> root     129509 128024  2 22:26 pts/0    00:00:00 km
> controller-manager --master=15.242.100.60:8888 --cloud-provider=mesos
> --cloud-config=./mesos-cloud.conf --v=1
> root     129538 128024  0 22:26 pts/0    00:00:00 km scheduler
> --address=15.242.100.60 --mesos-master=15.242.100.56:5050
> --etcd-servers=http://15.242.100.60:4001 --mesos-user=root
> --api-servers=15.242.100.60:8888 --cluster-dns=10.10.10.10
> --cluster-domain=cluster.local --v=2
>
> All the logs also seem OK, except for scheduler.log:
> ......
> I1228 22:26:37.883092  129538 messenger.go:381] Receiving message
> mesos.internal.InternalMasterChangeDetected from
> scheduler(1)@15.242.100.60:33077
> I1228 22:26:37.883225  129538 scheduler.go:374] New master
> master@15.242.100.56:5050 detected
> I1228 22:26:37.883268  129538 scheduler.go:435] No credentials were
> provided. Attempting to register scheduler without authentication.
> I1228 22:26:37.883356  129538 scheduler.go:928] Registering with
> master: master@15.242.100.56:5050
> I1228 22:26:37.883460  129538 messenger.go:187] Sending message
> mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
> I1228 22:26:37.883504  129538 scheduler.go:881] will retry
> registration in 1.209320575s if necessary
> I1228 22:26:37.883758  129538 http_transporter.go:193] Sending message
> to master@15.242.100.56:5050 via http
> I1228 22:26:37.883873  129538 http_transporter.go:587] libproc target
> URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
> I1228 22:26:39.093560  129538 scheduler.go:928] Registering with
> master: master@15.242.100.56:5050
> I1228 22:26:39.093659  129538 messenger.go:187] Sending message
> mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
> I1228 22:26:39.093702  129538 scheduler.go:881] will retry
> registration in 3.762036352s if necessary
> I1228 22:26:39.093765  129538 http_transporter.go:193] Sending message
> to master@15.242.100.56:5050 via http
> I1228 22:26:39.093847  129538 http_transporter.go:587] libproc target
> URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
> ......
>
> From the log, the Mesos master rejected the k8s registration, and
> k8s keeps retrying.
>
> Have you met this issue before? Thanks very much in advance!
> Best Regards
> Nan Xiao
>
>
> On Mon, Dec 28, 2015 at 7:26 PM, Klaus Ma <klaus1982.cn@gmail.com> wrote:
>> It seems Kubernetes is down; could you check the status of the
>> Kubernetes (km) processes?
>>
>> ----
>> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>> Platform Symphony/DCOS Development & Support, STG, IBM GCG
>> +86-10-8245 4084 | klaus1982.cn@gmail.com | http://k82.me
>>
>> On Mon, Dec 28, 2015 at 6:35 PM, Nan Xiao <xiaonan830818@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> Greetings from me!
>>>
>>> I am trying to follow this tutorial
>>>
>>> (https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/mesos.md)
>>> to deploy "k8s on Mesos" on local machines. k8s is built from the
>>> latest master branch, and Mesos is version 0.26.
>>>
>>> After starting the Mesos master (IP: 15.242.100.56), the Mesos
>>> slave (IP: 15.242.100.16), and k8s (IP: 15.242.100.60), I can see the
>>> following logs from the Mesos master:
>>>
>>> ......
>>> I1227 22:52:34.494478  8069 master.cpp:4269] Received update of slave
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 at slave(1)@15.242.100.16:5051
>>> (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed resources
>>> I1227 22:52:34.494940  8065 hierarchical.cpp:400] Slave
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0
>>> (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed
>>> resources  (total: cpus(*):32; mem(*):127878; disk(*):4336;
>>> ports(*):[31000-32000], allocated: )
>>> I1227 22:53:06.740757 8053 http.cpp:334] HTTP GET for
>>> /master/state.json from 15.242.100.60:56219 with
>>> User-Agent='Go-http-client/1.1'
>>> I1227 22:53:07.736419 8065 http.cpp:334] HTTP GET for
>>> /master/state.json from 15.242.100.60:56241 with
>>> User-Agent='Go-http-client/1.1'
>>> I1227 22:53:07.767196  8070 http.cpp:334] HTTP GET for
>>> /master/state.json from 15.242.100.60:56252 with
>>> User-Agent='Go-http-client/1.1'
>>> I1227 22:53:08.808171  8053 http.cpp:334] HTTP GET for
>>> /master/state.json from 15.242.100.60:56272 with
>>> User-Agent='Go-http-client/1.1'
>>> I1227 22:53:08.815811 8060 master.cpp:2176] Received SUBSCRIBE call
>>> for framework 'Kubernetes' at scheduler(1)@15.242.100.60:59488
>>> I1227 22:53:08.816182 8060 master.cpp:2247] Subscribing framework
>>> Kubernetes with checkpointing enabled and capabilities [  ]
>>> I1227 22:53:08.817294  8052 hierarchical.cpp:195] Added framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>>> I1227 22:53:08.817464  8050 master.cpp:1122] Framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>>> scheduler(1)@15.242.100.60:59488 disconnected
>>> E1227 22:53:08.817497 8073 process.cpp:1911] Failed to shutdown
>>> socket with fd 17: Transport endpoint is not connected
>>> I1227 22:53:08.817533  8050 master.cpp:2472] Disconnecting framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>>> scheduler(1)@15.242.100.60:59488
>>> I1227 22:53:08.817595 8050 master.cpp:2496] Deactivating framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>>> scheduler(1)@15.242.100.60:59488
>>> I1227 22:53:08.817797 8050 master.cpp:1146] Giving framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>>> scheduler(1)@15.242.100.60:59488 7625.14222623576weeks to failover
>>> W1227 22:53:08.818389 8062 master.cpp:4840] Master returning
>>> resources offered to framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 because the framework has
>>> terminated or is inactive
>>> I1227 22:53:08.818397  8052 hierarchical.cpp:273] Deactivated
>>> framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>>> I1227 22:53:08.819046  8066 hierarchical.cpp:744] Recovered
>>> cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000]
>>> (total: cpus(*):32; mem(*):127878; disk(*):4336;
>>> ports(*):[31000-32000], allocated: ) on slave
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 from framework
>>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>>> ......
>>>
>>> I can't figure out why the Mesos master complains "Failed to shutdown
>>> socket with fd 17: Transport endpoint is not connected".
>>> Could someone give me some clues about this issue?
>>>
>>> Thanks very much in advance!
>>>
>>> Best Regards
>>> Nan Xiao
>>
>>
