cloudstack-users mailing list archives

From Marc-Andre Jutras <mar...@marcuspocus.com>
Subject Re: cs 4.5.1, hosts stuck in disconnected status
Date Thu, 21 Jul 2016 18:09:31 GMT
Hey Francois,

Here are some suggestions:

Do you have a load balancer in front of your 3 CSMAN servers? If so, is 
any persistence defined in its configuration? Try setting it to SourceIP 
and fixing the timeout to something like 60 or 120 minutes.
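
For example, with HAProxy the relevant backend section could look like 
this (a sketch; the backend name, server names, and IPs are placeholders 
for your own setup):

    backend csman_ui
        balance source          # hash on client source IP = SourceIP persistence
        timeout server 120m     # keep long-lived sessions alive
        server man01 192.0.2.11:8080 check
        server man02 192.0.2.12:8080 check
        server man03 192.0.2.13:8080 check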

Also validate these points:

Under Global Settings / host, make sure your Xen hosts, VMs, and System 
VMs can reach the IP defined there.
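
A quick way to verify from one of the hosts or System VMs (a sketch; 
replace <host-setting-IP> with the IP from the global setting):

    ping -c 3 <host-setting-IP>
    telnet <host-setting-IP> 8250   # 8250 is the agent port; any port-test tool works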

iptables: make sure these TCP ports are open on each of your CSMAN 
servers: 8080, 8096, 8250, 9090 (and validate that these ports are open 
on your load balancer too).
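
A minimal sketch of the corresponding rules on a CSMAN server (assuming 
the usual INPUT chain on RHEL/CentOS; adapt to your own rule layout):

    for p in 8080 8096 8250 9090; do
        iptables -I INPUT -p tcp --dport "$p" -j ACCEPT
    done
    service iptables save   # persists the rules to /etc/sysconfig/iptables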

If your zone is set to Advanced mode, make sure each of your XenServers 
is running openvswitch (xe-switch-network-backend openvswitch); if not 
(Basic mode), set it to bridge (xe-switch-network-backend bridge). More 
info: 
http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/4.6/hypervisor/xenserver.html#install-cloudstack-xenserver-support-package-csp
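
To check which backend a XenServer is currently using before switching 
(a sketch; the setting lives in /etc/xensource/network.conf):

    cat /etc/xensource/network.conf          # prints "openvswitch" or "bridge"
    xe-switch-network-backend openvswitch    # switch if needed, then reboot the host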

Also check the iptables definitions on each of your Xen servers. To 
test, flush all tables and check whether CloudStack can connect 
correctly (iptables -F; the definitions live in /etc/sysconfig/iptables).
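
If you flush for testing, save the live rules first so you can restore 
them afterwards (a sketch):

    iptables-save > /root/iptables.backup     # snapshot the current rules
    iptables -F                               # flush, then re-test the connection
    iptables-restore < /root/iptables.backup  # restore the original rules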

You can also try deleting one Xen host and re-adding it to CloudStack, 
then check in the CS logs whether you see files being copied to the host.
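
While re-adding the host you can watch the management log for the copy 
(a sketch; the path is the default for a package install):

    tail -f /var/log/cloudstack/management/management-server.log | grep -i <hostname>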

Try that and keep us posted!

Marcus


On 2016-07-21 10:50 AM, Scheurer François wrote:
> Dear Stephan and Dag,
>
> we also thought about it and checked it, but the host was already enabled on Xen.
>
> Best Regards
> Francois
>
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
>
> -----Original Message-----
> From: Dag Sonstebo [mailto:Dag.Sonstebo@shapeblue.com]
> Sent: Thursday, July 21, 2016 1:23 PM
> To: users@cloudstack.apache.org
> Subject: Re: cs 4.5.1, hosts stuck in disconnected status
>
> Hi Francois,
>
> As pointed out by Stephan, the problem is probably with your Xen cluster rather than your
> CloudStack management. On the disconnected host you may want to carry out a xe-toolstack-restart
> - this will restart Xapi without affecting running VMs. After that, check your cluster with
> ‘xe host-list’ etc. If this doesn’t help you may have to consider restarting the host.
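>
> A minimal sequence on the disconnected host would be something like (a sketch):
>
>   xe-toolstack-restart                     # restarts Xapi only; running VMs keep running
>   xe host-list params=name-label,enabled   # every host should show enabled=true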
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
>
>
>
>
>
>
> On 21/07/2016, 11:25, "Francois Scheurer" <francois.scheurer@everyware.ch> wrote:
>
>> Dear CS contributors
>>
>>
>> We could fix the issue without having to restart the disconnected Xen hosts.
>> We suspect that the root cause was an interrupted agent transfer during
>> a restart of a Management Server (CSMAN).
>>
>> We have 3 CSMANs running in a cluster: man01, man02 and man03.
>> The disconnected host vh010 belongs to a Xen host cluster with 4 nodes:
>> vh009, vh010, vh011 and vh012.
>> See the chronological events from the logs, with our comments, regarding
>> the disconnection of vh010:
>>
>> ===>vh010 (host 19) was on agent 345049103441 (man02)
>>      vh010: Last Disconnected   2016-07-18T14:03:50+0200
>>      345049098498 = man01
>>      345049103441 = man02
>>      345049098122 = man03
>>
>>      ewcstack-man02-prod:
>>          2016-07-18T14:00:34.878973+02:00 ewcstack-man02-prod [audit
>> root/10467 as root/10467 on
>> pts/1/192.168.252.77:36251->192.168.225.72:22] /root: service
>> cloudstack-management restart; service cloudstack-usage restart
>>
>>      ewcstack-man02-prod:
>>          2016-07-18 14:02:15,797 DEBUG [c.c.s.StorageManagerImpl]
>> (StorageManager-Scavenger-1:ctx-ea98efd4) Storage pool garbage collector
>> found 0 templates to clean up in storage pool: ewcstack-vh010-prod Local
>> Storage
>>      !    2016-07-18 14:02:26,699 DEBUG
>> [c.c.a.m.ClusteredAgentManagerImpl] (StatsCollector-1:ctx-7da7a491) Host
>> 19 has switched to another management server, need to update agent map
>> with a forwarding agent attache
>>
>>      ewcstack-man01-prod:
>>          2016-07-18T14:02:47.317644+02:00 ewcstack-man01-prod [audit
>> root/11094 as root/11094 on
>> pts/0/192.168.252.77:40654->192.168.225.71:22] /root: service
>> cloudstack-management restart; service cloudstack-usage restart;
>>
>>      ewcstack-man02-prod:
>>          2016-07-18 14:03:24,859 DEBUG [c.c.s.StorageManagerImpl]
>> (StorageManager-Scavenger-1:ctx-c39aaa53) Storage pool garbage collector
>> found 0 templates to clean up in storage pool: ewcstack-vh010-prod Local
>> Storage
>>
>>      ewcstack-man02-prod:
>>          2016-07-18 14:03:26,260 DEBUG [c.c.a.m.AgentManagerImpl]
>> (AgentManager-Handler-6:null) SeqA 256-29401: Sending Seq 256-29401:  {
>> Ans: , MgmtId: 345049103441, via: 256, Ver: v1, Flags: 100010,
>> [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
>>          2016-07-18 14:03:28,535 DEBUG [c.c.s.StatsCollector]
>> (StatsCollector-1:ctx-814f1ae1) HostStatsCollector is running...
>>          2016-07-18 14:03:28,553 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 7-6771162039751540742: Forwarding
>> null to 345049098122
>>          2016-07-18 14:03:28,661 DEBUG [c.c.a.m.AgentManagerImpl]
>> (AgentManager-Handler-7:null) SeqA 244-153489: Processing Seq
>> 244-153489:  { Cmd , MgmtId: -1, via: 244, Ver: v1, Flags: 11,
>> [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1456,"_loadInfo":"{\n
>> \"connections\": []\n}","wait":0}}] }
>>          2016-07-18 14:03:28,667 DEBUG [c.c.a.m.AgentManagerImpl]
>> (AgentManager-Handler-7:null) SeqA 244-153489: Sending Seq 244-153489:
>> { Ans: , MgmtId: 345049103441, via: 244, Ver: v1, Flags: 100010,
>> [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
>>          2016-07-18 14:03:28,731 DEBUG [c.c.a.t.Request]
>> (StatsCollector-1:ctx-814f1ae1) Seq 7-6771162039751540742: Received:  {
>> Ans: , MgmtId: 345049103441, via: 7, Ver: v1, Flags: 10, {
>> GetHostStatsAnswer } }
>> ===>11 = vh006, 345049098122 = man03, vh006 is transferred to man03:
>>          2016-07-18 14:03:28,744 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 11-5143110774457106438: Forwarding
>> null to 345049098122
>>          2016-07-18 14:03:28,838 DEBUG [c.c.a.t.Request]
>> (StatsCollector-1:ctx-814f1ae1) Seq 11-5143110774457106438: Received:  {
>> Ans: , MgmtId: 345049103441, via: 11, Ver: v1, Flags: 10, {
>> GetHostStatsAnswer } }
>> ===>19 = vh010, 345049098498 = man01, vh010 is transferred to man01, but
>> man01 is stopping and starting at 14:02:47, so the transfer failed:
>>      !    2016-07-18 14:03:28,851 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083845: Forwarding
>> null to 345049098498
>>          2016-07-18 14:03:28,852 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083845: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:28,852 INFO [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) IOException Broken pipe when sending
>> data to peer 345049098498, close peer connection and let it re-open
>>          2016-07-18 14:03:28,856 WARN  [c.c.a.m.AgentManagerImpl]
>> (StatsCollector-1:ctx-814f1ae1) Exception while sending
>>          java.lang.NullPointerException
>>                  at
>> com.cloud.agent.manager.ClusteredAgentManagerImpl.connectToPeer(ClusteredAgentManagerImpl.java:527)
>>                  at
>> com.cloud.agent.manager.ClusteredAgentAttache.send(ClusteredAgentAttache.java:177)
>>                  at
>> com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:395)
>>                  at
>> com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:433)
>>                  at
>> com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:362)
>>                  at
>> com.cloud.agent.manager.AgentManagerImpl.easySend(AgentManagerImpl.java:919)
>>                  at
>> com.cloud.resource.ResourceManagerImpl.getHostStatistics(ResourceManagerImpl.java:2460)
>>                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>>                  at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                  at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                  at java.lang.reflect.Method.invoke(Method.java:606)
>>                  at
>> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>>                  at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>>                  at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>>                  at
>> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>>                  at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>                  at
>> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>>                  at com.sun.proxy.$Proxy149.getHostStatistics(Unknown
>> Source)
>>                  at
>> com.cloud.server.StatsCollector$HostCollector.runInContext(StatsCollector.java:325)
>>                  at
>> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>                  at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>                  at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>                  at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>                  at
>> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>                  at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>                  at
>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>                  at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>                  at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>                  at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>                  at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>                  at java.lang.Thread.run(Thread.java:745)
>>          2016-07-18 14:03:28,857 WARN  [c.c.r.ResourceManagerImpl]
>> (StatsCollector-1:ctx-814f1ae1) Unable to obtain host 19 statistics.
>>          2016-07-18 14:03:28,857 WARN  [c.c.s.StatsCollector]
>> (StatsCollector-1:ctx-814f1ae1) Received invalid host stats for host: 19
>>
>>          2016-07-18 14:03:28,870 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 21-6297439653947506693: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:28,887 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 25-2894407185515675660: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:28,903 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 29-4279264070932103175: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:28,919 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 33-123567514775977989: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,057 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 224-4524428775647084550: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,170 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083846: Error on
>> connecting to management node: null try = 1
>> ===>vh010 is invalid and stays disconnected:
>>      !    2016-07-18 14:03:29,174 WARN  [c.c.r.ResourceManagerImpl]
>> (StatsCollector-1:ctx-814f1ae1) Unable to obtain GPU stats for host
>> ewcstack-vh010-prod
>>          2016-07-18 14:03:29,183 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 21-6297439653947506694: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,196 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 25-2894407185515675661: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,212 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 29-4279264070932103176: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,226 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 33-123567514775977990: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:29,282 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-1:ctx-814f1ae1) Seq 224-4524428775647084551: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,246 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 19-2009731333714083847: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,302 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 21-6297439653947506695: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,352 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 25-2894407185515675662: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,381 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 29-4279264070932103177: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,421 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 33-123567514775977991: Error on
>> connecting to management node: null try = 1
>>          2016-07-18 14:03:30,691 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (StatsCollector-2:ctx-942dd66c) Seq 224-4524428775647084552: Error on
>> connecting to management node: null try = 1
>>
>> The table op_host_transfer shows 3 transfers that were not completed,
>> for ids 3, 15, 19 = vh007, vh011, vh010:
>>
>>      mysql> select * from op_host_transfer;
>>      +-----+------------------------+-----------------------+-------------------+---------------------+
>>      | id  | initial_mgmt_server_id | future_mgmt_server_id | state             | created             |
>>      +-----+------------------------+-----------------------+-------------------+---------------------+
>>      |   3 |           345049103441 |          345049098122 | TransferRequested | 2016-07-13 14:46:57 |
>>      |  15 |           345049103441 |          345049098122 | TransferRequested | 2016-07-14 16:15:11 |
>>      |  19 |           345049098498 |          345049103441 | TransferRequested | 2016-07-18 12:03:39 |
>>      | 130 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
>>      | 134 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:40 |
>>      | 150 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
>>      | 158 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
>>      | 221 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
>>      | 232 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
>>      | 244 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
>>      | 248 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
>>      | 250 |           345049098122 |          345049103441 | TransferRequested | 2016-07-15 18:54:35 |
>>      | 251 |           345049103441 |          345049098122 | TransferRequested | 2016-07-16 09:06:12 |
>>      | 252 |           345049103441 |          345049098122 | TransferRequested | 2016-07-18 11:22:06 |
>>      | 253 |           345049103441 |          345049098122 | TransferRequested | 2016-07-16 09:06:13 |
>>      | 254 |           345049103441 |          345049098122 | TransferRequested | 2016-07-18 11:22:07 |
>>      | 255 |           345049098122 |          345049098498 | TransferRequested | 2016-07-18 12:05:40 |
>>      +-----+------------------------+-----------------------+-------------------+---------------------+
>>
>>
>> Analysis:
>> A rolling restart of all 3 CSMANs (one by one) seems to have caused
>> these 3 uncompleted transfers, which in turn seem to be the cause of
>> the hosts stuck in the Disconnected status.
>>
>> If we stop all CSMANs and start a single one (e.g. man03), then these
>> 3 uncompleted transfers disappear and the hosts reconnect
>> automatically.
>> It is probably also possible to delete them manually in
>> op_host_transfer. (can you confirm this?)
>>
>> We also discovered an issue with loopback devices that are not removed
>> after a stop of the CSMAN.
>>
>>
>> Conclusion:
>>
>> Problem: Xen hosts become disconnected and stay disconnected forever.
>> Solution:
>>      stop all CSMANs
>>          losetup -a
>>          losetup -d /dev/loop{0..7}
>>          mysql> update host set
>> status="Up",resource_state="Enabled",mgmt_server_id=<CSMAN-ID> where
>> id=<HOST-ID>;
>>          mysql> update op_host_capacity set capacity_state="Enabled"
>> where host_id=<HOST-ID>;
>>          mysql> delete from op_host_transfer where id=<HOST-ID>;
>>      optional:
>>          on xen server host:
>>              xe-toolstack-restart; sleep 60
>>              xe host-list params=enabled
>>              xe host-enable host=<hostname>
>>      start a single CSMAN
>>      restart all System VMs (Secondary Storage and Console Proxy)
>>      wait until all hosts are connected
>>      start all the other CSMANs
>> Useful:
>>      mysql> select id,name,uuid,status,type, mgmt_server_id from host
>> where removed is NULL;
>>      mysql> select * from mshost;
>>      mysql> select * from op_host_transfer;
>>      mysql> select * from mshost where removed is NULL;
>>      mysql> select * from host_tags;
>>      mysql> select * from mshost_peer;
>>      mysql> select * from op_host_capacity order by host_id;
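>>
>> For example, to list only the hosts that are stuck in Disconnected
>> (same host table as above, a sketch):
>>      mysql> select id,name,status,mgmt_server_id from host where
>> status="Disconnected" and removed is NULL;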
>>
>>
>>
>> Best regards
>> Francois Scheurer
>>
>> On 21.07.2016 11:56, Francois Scheurer wrote:
>>> Dear CS contributors
>>>
>>>
>>> We use CS 4.5.1 with 3 clusters on XenServer 6.5.
>>>
>>> One host in one cluster (and another host in a second cluster as well)
>>> entered and stayed in the status "Disconnected".
>>>
>>> We tried to unmanage/remanage the cluster to force a reconnection, we
>>> also destroyed all System VMs (console proxy and secondary storage
>>> VMs), and we restarted all management servers.
>>> We verified on the Xen server that it is not disabled, and we restarted
>>> the Xen toolstack.
>>> We also updated the host table to set a mgmt_server_id: update host
>>> set
>>> status="Up",resource_state="Disabled",mgmt_server_id="345049103441"
>>> where id=15;
>>> Then we restarted the management servers again and also the System VMs.
>>> We finally updated the table again, this time without a
>>> mgmt_server_id: update host
>>> set status="Alert",resource_state="Disabled",mgmt_server_id=NULL where
>>> id=15;
>>> Then we restarted the management servers and the System VMs once more.
>>> Nothing helps; the server does not reconnect.
>>>
>>> Calling ForceReconnect shows this error:
>>>
>>> 2016-07-18 11:26:07,418 DEBUG [c.c.a.ApiServlet]
>>> (catalina-exec-13:ctx-4e82fdce) ===START===  192.168.252.77 -- GET
>>> command=reconnectHost&id=3490cfa0-b2a7-4a12-aa5e-7e351ce9df00&response=json&sessionkey=Tnc9l6aaSvc8J5SNy3Z71FLXgEI%3D&_=1468833953948
>>>
>>> 2016-07-18 11:26:07,450 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>>> (API-Job-Executor-23:ctx-fc340a8e job-148672) Add job-148672 into job
>>> monitoring
>>> 2016-07-18 11:26:07,453 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (catalina-exec-13:ctx-4e82fdce ctx-9c696de2) submit async job-148672,
>>> details: AsyncJobVO {id:148672, userId: 51, accountId: 51,
>>> instanceType: Host, instanceId: 15, cmd:
>>> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd,
>>> cmdInfo:
>>> {"id":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","response":"json","sessionkey":"Tnc9l6aaSvc8J5SNy3Z71FLXgEI\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00\"}","cmdEventType":"HOST.RECONNECT","ctxUserId":"51","httpmethod":"GET","_":"1468833953948","uuid":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","ctxAccountId":"51","ctxStartEventId":"18026840"},
>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>> result: null, initMsid: 345049098122, completeMsid: null, lastUpdated:
>>> null, lastPolled: null, created: null}
>>> 2016-07-18 11:26:07,454 DEBUG [c.c.a.ApiServlet]
>>> (catalina-exec-13:ctx-4e82fdce ctx-9c696de2) ===END=== 192.168.252.77
>>> -- GET
>>> command=reconnectHost&id=3490cfa0-b2a7-4a12-aa5e-7e351ce9df00&response=json&sessionkey=Tnc9l6aaSvc8J5SNy3Z71FLXgEI%3D&_=1468833953948
>>> 2016-07-18 11:26:07,455 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (API-Job-Executor-23:ctx-fc340a8e job-148672) Executing AsyncJobVO
>>> {id:148672, userId: 51, accountId: 51, instanceType: Host, instanceId:
>>> 15, cmd:
>>> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd,
>>> cmdInfo:
>>> {"id":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","response":"json","sessionkey":"Tnc9l6aaSvc8J5SNy3Z71FLXgEI\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00\"}","cmdEventType":"HOST.RECONNECT","ctxUserId":"51","httpmethod":"GET","_":"1468833953948","uuid":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","ctxAccountId":"51","ctxStartEventId":"18026840"},
>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>> result: null, initMsid: 345049098122, completeMsid: null, lastUpdated:
>>> null, lastPolled: null, created: null}
>>> 2016-07-18 11:26:07,461 DEBUG [c.c.a.m.DirectAgentAttache]
>>> (DirectAgent-495:ctx-77e68e88) Seq 213-6743858967010618892: Executing
>>> request
>>> 2016-07-18 11:26:07,467 INFO  [c.c.a.m.AgentManagerImpl]
>>> (API-Job-Executor-23:ctx-fc340a8e job-148672 ctx-0061c491) Unable to
>>> disconnect host because it is not connected to this server: 15
>>> 2016-07-18 11:26:07,467 WARN [o.a.c.a.c.a.h.ReconnectHostCmd]
>>> (API-Job-Executor-23:ctx-fc340a8e job-148672 ctx-0061c491) Exception:
>>> org.apache.cloudstack.api.ServerApiException: Failed to reconnect host
>>>      at
>>> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd.execute(ReconnectHostCmd.java:109)
>>>      at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>>>      at
>>> com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
>>>      at
>>> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>>>      at
>>> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>      at
>>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>      at
>>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>      at
>>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>      at
>>> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>      at
>>> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>      at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>      at java.lang.Thread.run(Thread.java:745)
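>>>
>>> For reference, the same reconnectHost call can also be issued directly
>>> against the unauthenticated integration API on port 8096, bypassing
>>> the UI session (a sketch, assuming the integration.api.port global
>>> setting is enabled):
>>>
>>>    curl "http://<CSMAN-IP>:8096/client/api?command=reconnectHost&id=<host-uuid>&response=json"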
>>>
>>> Connecting via SSH from the management server works fine, for example:
>>>    [root@ewcstack-man03-prod ~]# ssh -i
>>> /var/cloudstack/management/.ssh/id_rsa root@ewcstack-vh011-prod
>>> "/opt/cloud/bin/router_proxy.sh netusage.sh 169.254.2.103 -g"
>>>    root@ewcstack-vh011-prod's password:
>>>    2592:0:0:0:[root@ewcstack-man03-prod ~]#
>>>
>>>
>>> Any idea how to solve this issue, and how to track down the reason
>>> for the failure to reconnect?
>>>
>>> Many thanks in advance for your help.
>>>
>>>
>>>
>>> Best Regards
>>> Francois
>>>
>>>
>>>
>>>
>>>
>>>
>> -- 
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue
>

