cloudstack-users mailing list archives

From Simon Weller <swel...@ena.com>
Subject Re: Router VM: patchviasocket.py timeout issue on 1 out of 4 networks
Date Tue, 13 Dec 2016 02:31:41 GMT
Can you turn on agent debug mode and take a look at the debug level logs?


You can do that by running sed -i 's/INFO/DEBUG/g' /etc/cloudstack/agent/log4j-cloud.xml on
the host and then restarting the agent.
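
For anyone who would rather keep a backup before editing, the same substitution can be sketched in Python. This is only an illustration of what the sed one-liner does: the helper names `enable_debug` and `patch_file` and the `.bak` suffix are made up here, and the agent still has to be restarted afterwards.

```python
import shutil
from pathlib import Path


def enable_debug(xml_text: str) -> str:
    """Replace every literal INFO with DEBUG, like sed 's/INFO/DEBUG/g'."""
    return xml_text.replace("INFO", "DEBUG")


def patch_file(path: str) -> int:
    """Back up the file next to itself, rewrite it, and return the
    number of INFO occurrences that were replaced."""
    p = Path(path)
    original = p.read_text()
    # Hypothetical backup name: <file>.bak in the same directory.
    shutil.copy2(p, p.with_name(p.name + ".bak"))
    p.write_text(enable_debug(original))
    return original.count("INFO")


if __name__ == "__main__":
    # Path taken from the command above; run as root on the KVM host.
    print(patch_file("/etc/cloudstack/agent/log4j-cloud.xml"))
```

To go back to the previous log level, copy the `.bak` file over the original and restart the agent again.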


- Si




________________________________
From: Syahrul Sazli Shaharir <sazli@pulasan.my>
Sent: Monday, December 12, 2016 8:21 PM
To: users@cloudstack.apache.org
Subject: Router VM: patchviasocket.py timeout issue on 1 out of 4 networks

Hi,

I am running the latest CloudStack, 4.9.0.1, in a CentOS 7 KVM + ceph
environment. After running for some time, I hit an issue with
one out of 4 networks: following a heartbeat-induced reset on all
hosts, the associated virtual router would not get recreated and
started properly on any of the 3 hosts I have, even after repeated
attempts of the following:
- destroy-recreate cycles, via the CloudStack UI
- restartNetwork cleanup=true API calls (failed with errorcode = 530)
- redownloading and re-registering the system VM template as another
entry and assigning it to the router VM in global settings (boots the
new template OK, but still the same problem)
- tweaking the default system offering for the router VM (increased RAM
from 256 to 512MB)
- creating a new system offering, with the RAM tweak and use of the ceph
rbd store, and assigning it to Cloud.Com-SoftwareRouter as per the docs,
which didn't work for some reason: it kept on using the initial default
offering and created the image on local host storage
- upgrading to the latest CloudStack (previously was running 4.8)

As with a handful of others in the list archives, virsh list and
dumpxml show that the VM is created OK but fails soon after booting, as
seen in the following errors in agent.log:

2016-12-13 10:03:33,894 WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-1:null) (logid:633e6e03) Timed out:
/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py
-n r-668-VM -p %template=domP%name=r-668-VM%eth0ip=10.3.28.10%eth0mask=255.255.255.0%gateway=10.3.28.1%domain=nocser.net%cidrsize=24%dhcprange=10.3.28.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=8.8.8.8%dns2=8.8.4.4%ip6dns1=%ip6dns2=%baremetalnotificationsecuritykey=uavJByNGGjNLrELG-qbdN99__1I3tnp8qa0KbcsKokKJcPB43K9s6oQu2nMLqo3YP8p6jqDy5XT3WWOWBA2yNw%baremetalnotificationapikey=8JH4mdkxsEMhgIBgMonkNXAEKjVOeZnG1m5UVekvvo4v_iXQ4ZS7rh6NNS0qphhc7ZrCauiz23tp2-Wa3AASlg%host=10.2.30.11%port=8080
.  Output is:
.....
2016-12-13 10:05:45,895 WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-1:null) (logid:633e6e03) Timed out:
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh
vr_cfg.sh 169.254.0.33 -c
/var/cache/cloud/VR-48ea8a95-6c02-499f-88d3-eae5bf9f9fbe.cfg .  Output
is:
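
The `-p` argument in the first warning is a single `%key=value` string, which is hard to read by eye. A minimal sketch for splitting it up, so the values for the failing router can be compared with those of a working one, might look like this. The function names are hypothetical, the sample is a shortened copy of the string above (keys omitted), and it assumes no value contains a literal `%`.

```python
import ipaddress


def parse_boot_args(cmdline: str) -> dict:
    """Split a '%key=value%key=value' string into a dict.

    Assumes values never contain '%'; empty values are kept as ''.
    """
    params = {}
    for field in cmdline.split("%"):
        if not field:
            continue  # skip the empty piece before the leading '%'
        key, _, value = field.partition("=")
        params[key] = value
    return params


def gateway_on_eth0(params: dict) -> bool:
    """Check that the gateway lies inside the eth0 subnet."""
    net = ipaddress.ip_network(
        f"{params['eth0ip']}/{params['eth0mask']}", strict=False
    )
    return ipaddress.ip_address(params["gateway"]) in net


# Shortened copy of the -p string from the log (security keys left out).
sample = (
    "%template=domP%name=r-668-VM%eth0ip=10.3.28.10"
    "%eth0mask=255.255.255.0%gateway=10.3.28.1%cidrsize=24"
    "%dhcprange=10.3.28.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0"
    "%type=dhcpsrvr%ip6dns1=%ip6dns2="
)

args = parse_boot_args(sample)
for key in ("name", "type", "eth0ip", "eth0mask", "gateway", "eth1ip"):
    print(f"{key:10} {args[key]}")
print("gateway in eth0 subnet:", gateway_on_eth0(args))
```

Note that the address in `eth1ip` (169.254.0.33) is the same one that the second timed-out command, router_proxy.sh, is trying to reach.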

As mentioned, this only happens with 1 network (always the same
network). The other router VMs work OK. Any clues on how to
troubleshoot this further would be greatly appreciated.

Thanks.

--
--sazli
Syahrul Sazli Shaharir <sazli@pulasan.my>
