cloudstack-users mailing list archives

From Francois Scheurer <francois.scheu...@everyware.ch>
Subject Re: cs 4.5.1, hosts stuck in disconnected status
Date Thu, 21 Jul 2016 10:25:10 GMT
Dear CS contributors


We could fix the issue without having to restart the disconnected Xen hosts.
We suspect that the root cause was an interrupted agent transfer during 
a restart of a Management Server (CSMAN).

We have 3 CSMANs running in a cluster: man01, man02 and man03.
The disconnected vh010 belongs to a XenServer cluster with 4 nodes: 
vh009, vh010, vh011 and vh012.
Below are the chronological events from the logs, with our comments, 
regarding the disconnection of vh010:

===>vh010 (host 19) was on agent 345049103441 (man02)
     vh010: Last Disconnected   2016-07-18T14:03:50+0200
     345049098498 = man01
     345049103441 = man02
     345049098122 = man03

     ewcstack-man02-prod:
         2016-07-18T14:00:34.878973+02:00 ewcstack-man02-prod [audit 
root/10467 as root/10467 on 
pts/1/192.168.252.77:36251->192.168.225.72:22] /root: service 
cloudstack-management restart; service cloudstack-usage restart

     ewcstack-man02-prod:
         2016-07-18 14:02:15,797 DEBUG [c.c.s.StorageManagerImpl] 
(StorageManager-Scavenger-1:ctx-ea98efd4) Storage pool garbage collector 
found 0 templates to clean up in storage pool: ewcstack-vh010-prod Local 
Storage
     !    2016-07-18 14:02:26,699 DEBUG 
[c.c.a.m.ClusteredAgentManagerImpl] (StatsCollector-1:ctx-7da7a491) Host 
19 has switched to another management server, need to update agent map 
with a forwarding agent attache

     ewcstack-man01-prod:
         2016-07-18T14:02:47.317644+02:00 ewcstack-man01-prod [audit 
root/11094 as root/11094 on 
pts/0/192.168.252.77:40654->192.168.225.71:22] /root: service 
cloudstack-management restart; service cloudstack-usage restart;

     ewcstack-man02-prod:
         2016-07-18 14:03:24,859 DEBUG [c.c.s.StorageManagerImpl] 
(StorageManager-Scavenger-1:ctx-c39aaa53) Storage pool garbage collector 
found 0 templates to clean up in storage pool: ewcstack-vh010-prod Local 
Storage

     ewcstack-man02-prod:
         2016-07-18 14:03:26,260 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-6:null) SeqA 256-29401: Sending Seq 256-29401:  { 
Ans: , MgmtId: 345049103441, via: 256, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
         2016-07-18 14:03:28,535 DEBUG [c.c.s.StatsCollector] 
(StatsCollector-1:ctx-814f1ae1) HostStatsCollector is running...
         2016-07-18 14:03:28,553 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 7-6771162039751540742: Forwarding 
null to 345049098122
         2016-07-18 14:03:28,661 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-7:null) SeqA 244-153489: Processing Seq 
244-153489:  { Cmd , MgmtId: -1, via: 244, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1456,"_loadInfo":"{\n

\"connections\": []\n}","wait":0}}] }
         2016-07-18 14:03:28,667 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-7:null) SeqA 244-153489: Sending Seq 244-153489:  
{ Ans: , MgmtId: 345049103441, via: 244, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
         2016-07-18 14:03:28,731 DEBUG [c.c.a.t.Request] 
(StatsCollector-1:ctx-814f1ae1) Seq 7-6771162039751540742: Received:  { 
Ans: , MgmtId: 345049103441, via: 7, Ver: v1, Flags: 10, { 
GetHostStatsAnswer } }
===>11 = vh006, 345049098122 = man03, vh006 is transferred to man03:
         2016-07-18 14:03:28,744 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 11-5143110774457106438: Forwarding 
null to 345049098122
         2016-07-18 14:03:28,838 DEBUG [c.c.a.t.Request] 
(StatsCollector-1:ctx-814f1ae1) Seq 11-5143110774457106438: Received:  { 
Ans: , MgmtId: 345049103441, via: 11, Ver: v1, Flags: 10, { 
GetHostStatsAnswer } }
===>19 = vh010, 345049098498 = man01, vh010 is transferred to man01, but 
man01 is being restarted at 14:02:47, so the transfer failed:
     !    2016-07-18 14:03:28,851 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083845: Forwarding 
null to 345049098498
         2016-07-18 14:03:28,852 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083845: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:28,852 INFO [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) IOException Broken pipe when sending 
data to peer 345049098498, close peer connection and let it re-open
         2016-07-18 14:03:28,856 WARN  [c.c.a.m.AgentManagerImpl] 
(StatsCollector-1:ctx-814f1ae1) Exception while sending
         java.lang.NullPointerException
                 at 
com.cloud.agent.manager.ClusteredAgentManagerImpl.connectToPeer(ClusteredAgentManagerImpl.java:527)
                 at 
com.cloud.agent.manager.ClusteredAgentAttache.send(ClusteredAgentAttache.java:177)
                 at 
com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:395)
                 at 
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:433)
                 at 
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:362)
                 at 
com.cloud.agent.manager.AgentManagerImpl.easySend(AgentManagerImpl.java:919)
                 at 
com.cloud.resource.ResourceManagerImpl.getHostStatistics(ResourceManagerImpl.java:2460)
                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
                 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                 at java.lang.reflect.Method.invoke(Method.java:606)
                 at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
                 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
                 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
                 at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
                 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
                 at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
                 at com.sun.proxy.$Proxy149.getHostStatistics(Unknown 
Source)
                 at 
com.cloud.server.StatsCollector$HostCollector.runInContext(StatsCollector.java:325)
                 at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
                 at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
                 at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
                 at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
                 at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
                 at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
                 at 
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
                 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
                 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
                 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                 at java.lang.Thread.run(Thread.java:745)
         2016-07-18 14:03:28,857 WARN  [c.c.r.ResourceManagerImpl] 
(StatsCollector-1:ctx-814f1ae1) Unable to obtain host 19 statistics.
         2016-07-18 14:03:28,857 WARN  [c.c.s.StatsCollector] 
(StatsCollector-1:ctx-814f1ae1) Received invalid host stats for host: 19

         2016-07-18 14:03:28,870 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 21-6297439653947506693: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:28,887 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 25-2894407185515675660: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:28,903 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 29-4279264070932103175: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:28,919 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 33-123567514775977989: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,057 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 224-4524428775647084550: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,170 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 19-2009731333714083846: Error on 
connecting to management node: null try = 1
===>vh010 is invalid and stays disconnected:
     !    2016-07-18 14:03:29,174 WARN  [c.c.r.ResourceManagerImpl] 
(StatsCollector-1:ctx-814f1ae1) Unable to obtain GPU stats for host 
ewcstack-vh010-prod
         2016-07-18 14:03:29,183 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 21-6297439653947506694: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,196 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 25-2894407185515675661: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,212 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 29-4279264070932103176: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,226 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 33-123567514775977990: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:29,282 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-1:ctx-814f1ae1) Seq 224-4524428775647084551: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,246 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 19-2009731333714083847: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,302 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 21-6297439653947506695: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,352 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 25-2894407185515675662: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,381 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 29-4279264070932103177: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,421 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 33-123567514775977991: Error on 
connecting to management node: null try = 1
         2016-07-18 14:03:30,691 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(StatsCollector-2:ctx-942dd66c) Seq 224-4524428775647084552: Error on 
connecting to management node: null try = 1

The table op_host_transfer shows 3 transfers that were not completed, 
for id 3, 15, 19 = vh007, vh011, vh010:

     mysql> select * from op_host_transfer;
     +-----+------------------------+-----------------------+-------------------+---------------------+
     | id  | initial_mgmt_server_id | future_mgmt_server_id | state             | created             |
     +-----+------------------------+-----------------------+-------------------+---------------------+
     |   3 |           345049103441 |          345049098122 | TransferRequested | 2016-07-13 14:46:57 |
     |  15 |           345049103441 |          345049098122 | TransferRequested | 2016-07-14 16:15:11 |
     |  19 |           345049098498 |          345049103441 | TransferRequested | 2016-07-18 12:03:39 |
     | 130 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
     | 134 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:40 |
     | 150 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
     | 158 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
     | 221 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
     | 232 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
     | 244 |           345049103441 |          345049098498 | TransferRequested | 2016-07-13 14:52:00 |
     | 248 |           345049103441 |          345049098498 | TransferRequested | 2016-07-03 08:54:41 |
     | 250 |           345049098122 |          345049103441 | TransferRequested | 2016-07-15 18:54:35 |
     | 251 |           345049103441 |          345049098122 | TransferRequested | 2016-07-16 09:06:12 |
     | 252 |           345049103441 |          345049098122 | TransferRequested | 2016-07-18 11:22:06 |
     | 253 |           345049103441 |          345049098122 | TransferRequested | 2016-07-16 09:06:13 |
     | 254 |           345049103441 |          345049098122 | TransferRequested | 2016-07-18 11:22:07 |
     | 255 |           345049098122 |          345049098498 | TransferRequested | 2016-07-18 12:05:40 |
     +-----+------------------------+-----------------------+-------------------+---------------------+
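
For reference, a join like the following (our own sketch, relying on the fact 
that the op_host_transfer id equals the host id, as it does in the rows above) 
shows only the stale transfer requests together with the host name and status:

     mysql> select t.id, h.name, h.status, t.initial_mgmt_server_id,
                   t.future_mgmt_server_id, t.state, t.created
            from op_host_transfer t join host h on h.id = t.id
            where t.state = 'TransferRequested' and h.removed is NULL;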


Analysis:
A rolling restart of all 3 CSMANs (one by one) seems to have caused 
these 3 uncompleted transfers, which in turn appear to be what left the 
hosts stuck in Disconnected status.

When we stopped all CSMANs and started a single one (e.g. man03), 
these 3 uncompleted transfers disappeared and the hosts reconnected 
automatically.
It is probably also possible to delete them manually from 
op_host_transfer (can you confirm this?).
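
If manual cleanup is indeed safe (untested on our side, hence the question), 
the statement would presumably look like this, run while all CSMANs are stopped:

     mysql> delete from op_host_transfer
            where state = 'TransferRequested' and id = <HOST-ID>;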

We also discovered an issue with loopback devices that are not removed 
after a stop of the CSMAN.


Conclusion:

Problem: Xen hosts become Disconnected and stay that way indefinitely.
Solution:
     stop all CSMANs
         losetup -a                    (check for leftover loopback devices)
         losetup -d /dev/loop{0..7}    (detach them)
         mysql> update host set status="Up",resource_state="Enabled",mgmt_server_id=<CSMAN-ID> where id=<HOST-ID>;
         mysql> update op_host_capacity set capacity_state="Enabled" where host_id=<HOST-ID>;
         mysql> delete from op_host_transfer where id=<HOST-ID>;
     optional:
         on the XenServer host:
             xe-toolstack-restart; sleep 60
             xe host-list params=enabled
             xe host-enable host=<hostname>
     start a single CSMAN
     restart all System VMs (Secondary Storage and Console Proxy)
     wait until all hosts are connected
     start all other CSMANs
Useful:
     mysql> select id,name,uuid,status,type,mgmt_server_id from host where removed is NULL;
     mysql> select * from mshost;
     mysql> select * from op_host_transfer;
     mysql> select * from mshost where removed is NULL;
     mysql> select * from host_tags;
     mysql> select * from mshost_peer;
     mysql> select * from op_host_capacity order by host_id;
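
While only the single CSMAN is running, a filtered variant of the first query 
above (using only columns already present in the host table) is handy to watch 
the hosts coming back:

     mysql> select id, name, status, mgmt_server_id from host
            where removed is NULL and status != 'Up';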



Best regards
Francois Scheurer

On 21.07.2016 11:56, Francois Scheurer wrote:
> Dear CS contributors
>
>
> We use CS 4.5.1 on 3 clusters with XenServer 6.5.
>
> One host in a cluster (and another host in a different cluster as well) 
> went into status "Disconnected" and stayed there.
>
> We tried to unmanage/re-manage the cluster to force a reconnection, we 
> also destroyed all System VMs (console proxy and secondary storage 
> VMs), and we restarted all management servers.
> We verified on the XenServer that the host is not disabled, and we 
> restarted the Xen toolstack.
> We also updated the host table to set a mgmt_server_id:
> update host set status="Up",resource_state="Disabled",mgmt_server_id="345049103441" where id=15;
> Then we restarted the management servers again and also the System VMs.
> We finally updated the table again, this time without a mgmt_server_id:
> update host set status="Alert",resource_state="Disabled",mgmt_server_id=NULL where id=15;
> Then we restarted the management servers again and also the System VMs.
> Nothing helped; the host does not reconnect.
>
> Calling ForceReconnect shows this error:
>
> 2016-07-18 11:26:07,418 DEBUG [c.c.a.ApiServlet] 
> (catalina-exec-13:ctx-4e82fdce) ===START===  192.168.252.77 -- GET 
> command=reconnectHost&id=3490cfa0-b2a7-4a12-aa5e-7e351ce9df00&response=json&sessionkey=Tnc9l6aaSvc8J5SNy3Z71FLXgEI%3D&_=1468833953948

>
> 2016-07-18 11:26:07,450 INFO [o.a.c.f.j.i.AsyncJobMonitor] 
> (API-Job-Executor-23:ctx-fc340a8e job-148672) Add job-148672 into job 
> monitoring
> 2016-07-18 11:26:07,453 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (catalina-exec-13:ctx-4e82fdce ctx-9c696de2) submit async job-148672, 
> details: AsyncJobVO {id:148672, userId: 51, accountId: 51, 
> instanceType: Host, instanceId: 15, cmd: 
> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd, 
> cmdInfo: 
> {"id":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","response":"json","sessionkey":"Tnc9l6aaSvc8J5SNy3Z71FLXgEI\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00\"}","cmdEventType":"HOST.RECONNECT","ctxUserId":"51","httpmethod":"GET","_":"1468833953948","uuid":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","ctxAccountId":"51","ctxStartEventId":"18026840"},

> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
> result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: 
> null, lastPolled: null, created: null}
> 2016-07-18 11:26:07,454 DEBUG [c.c.a.ApiServlet] 
> (catalina-exec-13:ctx-4e82fdce ctx-9c696de2) ===END=== 192.168.252.77 
> -- GET 
> command=reconnectHost&id=3490cfa0-b2a7-4a12-aa5e-7e351ce9df00&response=json&sessionkey=Tnc9l6aaSvc8J5SNy3Z71FLXgEI%3D&_=1468833953948
> 2016-07-18 11:26:07,455 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (API-Job-Executor-23:ctx-fc340a8e job-148672) Executing AsyncJobVO 
> {id:148672, userId: 51, accountId: 51, instanceType: Host, instanceId: 
> 15, cmd: 
> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd, 
> cmdInfo: 
> {"id":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","response":"json","sessionkey":"Tnc9l6aaSvc8J5SNy3Z71FLXgEI\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00\"}","cmdEventType":"HOST.RECONNECT","ctxUserId":"51","httpmethod":"GET","_":"1468833953948","uuid":"3490cfa0-b2a7-4a12-aa5e-7e351ce9df00","ctxAccountId":"51","ctxStartEventId":"18026840"},

> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
> result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: 
> null, lastPolled: null, created: null}
> 2016-07-18 11:26:07,461 DEBUG [c.c.a.m.DirectAgentAttache] 
> (DirectAgent-495:ctx-77e68e88) Seq 213-6743858967010618892: Executing 
> request
> 2016-07-18 11:26:07,467 INFO  [c.c.a.m.AgentManagerImpl] 
> (API-Job-Executor-23:ctx-fc340a8e job-148672 ctx-0061c491) Unable to 
> disconnect host because it is not connected to this server: 15
> 2016-07-18 11:26:07,467 WARN [o.a.c.a.c.a.h.ReconnectHostCmd] 
> (API-Job-Executor-23:ctx-fc340a8e job-148672 ctx-0061c491) Exception:
> org.apache.cloudstack.api.ServerApiException: Failed to reconnect host
>     at 
> org.apache.cloudstack.api.command.admin.host.ReconnectHostCmd.execute(ReconnectHostCmd.java:109)
>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>     at 
> com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
>     at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>     at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>     at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>     at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>     at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>     at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>     at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> Connecting via SSH from the management server is fine, for ex.:
>   [root@ewcstack-man03-prod ~]# ssh -i 
> /var/cloudstack/management/.ssh/id_rsa root@ewcstack-vh011-prod 
> "/opt/cloud/bin/router_proxy.sh netusage.sh 169.254.2.103 -g"
>   root@ewcstack-vh011-prod's password:
>   2592:0:0:0:[root@ewcstack-man03-prod ~]#
>
>
> Any idea how to solve this issue and how to track down the reason for 
> the failure to reconnect?
>
> Many thanks in advance for your help.
>
>
>
> Best Regards
> Francois
>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch

