cloudstack-users mailing list archives

From Syahrul Sazli Shaharir <sa...@pulasan.my>
Subject Re: Router VM: patchviasocket.py timeout issue on 1 out of 4 networks
Date Sat, 17 Dec 2016 02:54:41 GMT
On Fri, Dec 16, 2016 at 5:16 PM, Dag Sonstebo
<Dag.Sonstebo@shapeblue.com> wrote:
> Hi Syahrul,
>
> It just struck me we had similar issues with patchviasocket.py and python-argparse with
> one of our clients a while back, I believe our fix is going into 4.9.1.0:
>
> https://github.com/apache/cloudstack/pull/1634

Hi Dag,

As I'm already running CentOS 7 with Python 2.7, would this still apply?
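
For reference, argparse has been part of the Python standard library since 2.7, so it should already be importable on this setup. A minimal sketch to confirm that on a host follows; the function name argparse_status is made up for illustration, and the check assumes it is run with the same interpreter the agent uses for patchviasocket.py. It says nothing about whether the rest of that pull request is relevant.

```python
import sys


def argparse_status():
    """Return ((major, minor), argparse_available) for the running interpreter.

    Hypothetical helper for illustration only; not part of CloudStack.
    """
    try:
        import argparse  # noqa: F401  (in the standard library since Python 2.7)
        available = True
    except ImportError:
        available = False
    return tuple(sys.version_info[:2]), available


if __name__ == "__main__":
    version, available = argparse_status()
    print("Python %d.%d, argparse importable: %s"
          % (version[0], version[1], available))
```

If this reports that argparse is importable, a missing python-argparse package is unlikely to be what makes patchviasocket.py time out here.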

Thanks.


>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 15/12/2016, 23:09, "Syahrul Sazli Shaharir" <sazli@pulasan.my> wrote:
>
>     Hi Ilya,
>
>     I've looked at the patch suggested, looks like it has been committed
>     into qemu 2.4.0, and I can see the modified parts in the latest qemu
>     2.6.0 code. So I went ahead and installed qemu-kvm-ev-2.6.0-27.1 on
>     one of the hosts. But the problem still persists. Perhaps I should
>     bring this issue to that dev thread.
>
>     Thanks for the help! :)
>
>     On Thu, Dec 15, 2016 at 11:03 AM, ilya <ilya.mailing.lists@gmail.com> wrote:
>     > This will explain a bit more on how this issue came about and how to fix
>     > it..
>     > https://www.mail-archive.com/dev@cloudstack.apache.org/msg71559.html
>     >
>     > On 12/12/16 6:31 PM, Simon Weller wrote:
>     >> Can you turn on agent debug mode and take a look at the debug level logs?
>     >>
>     >>
>     >> You can do that by running sed -i 's/INFO/DEBUG/g' /etc/cloudstack/agent/log4j-cloud.xml
>     >> on the host and then restarting the agent.
>     >>
>     >>
>     >> - Si
>     >>
>     >>
>     >>
>     >>
>     >> ________________________________
>     >> From: Syahrul Sazli Shaharir <sazli@pulasan.my>
>     >> Sent: Monday, December 12, 2016 8:21 PM
>     >> To: users@cloudstack.apache.org
>     >> Subject: Router VM: patchviasocket.py timeout issue on 1 out of 4 networks
>     >>
>     >> Hi,
>     >>
>     >> I am running the latest CloudStack 4.9.0.1 on a CentOS 7 KVM + ceph
>     >> environment. After running for some time, I was faced with an issue with
>     >> one out of 4 networks - following a heartbeat-induced reset on all
>     >> hosts, the associated virtual router would not get recreated and
>     >> started properly on any of the 3 hosts I have, even after repeated
>     >> attempts of the following:-
>     >> - destroy-recreate cycles, via Cloudstack UI
>     >> - restartNetwork cleanup=true API calls (failed with errorcode = 530).
>     >> - redownload and reregister system VM template as another entry and
>     >> assign to router VM in global setting (boots the new template OK, but
>     >> still same problem)
>     >> - tweak default system offering for router VM (increased RAM from 256 to 512MB)
>     >> - created new system offering, with RAM tweak, and use of ceph rbd
>     >> store, and assigned it to Cloud.Com-SoftwareRouter as per docs - which
>     >> didn't work for some reason: it kept on using initial default offering
>     >> and created image on local host storage
>     >> - upgrade to latest cloudstack (previously was running 4.8)
>     >>
>     >> As with a handful of others in this list's archives, virsh list and
>     >> dumpxml show the VM is created OK but fails soon after booting, as
>     >> seen in the following error in agent.log :-
>     >>
>     >> 2016-12-13 10:03:33,894 WARN  [kvm.resource.LibvirtComputingResource]
>     >> (agentRequest-Handler-1:null) (logid:633e6e03) Timed out:
>     >> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py
>     >> -n r-668-VM -p %template=domP%name=r-668-VM%eth0ip=10.3.28.10%eth0mask=255.255.255.0%gateway=10.3.28.1%domain=nocser.net%cidrsize=24%dhcprange=10.3.28.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=8.8.8.8%dns2=8.8.4.4%ip6dns1=%ip6dns2=%baremetalnotificationsecuritykey=uavJByNGGjNLrELG-qbdN99__1I3tnp8qa0KbcsKokKJcPB43K9s6oQu2nMLqo3YP8p6jqDy5XT3WWOWBA2yNw%baremetalnotificationapikey=8JH4mdkxsEMhgIBgMonkNXAEKjVOeZnG1m5UVekvvo4v_iXQ4ZS7rh6NNS0qphhc7ZrCauiz23tp2-Wa3AASlg%host=10.2.30.11%port=8080
>     >> .  Output is:
>     >> .....
>     >> 2016-12-13 10:05:45,895 WARN  [kvm.resource.LibvirtComputingResource]
>     >> (agentRequest-Handler-1:null) (logid:633e6e03) Timed out:
>     >> /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh
>     >> vr_cfg.sh 169.254.0.33 -c
>     >> /var/cache/cloud/VR-48ea8a95-6c02-499f-88d3-eae5bf9f9fbe.cfg .  Output
>     >> is:
>     >>
>     >> As mentioned, this only happens with 1 network (always the same
>     >> network). The other router VMs work OK. Any clues on how to
>     >> troubleshoot this further, would be greatly appreciated.
>     >>
>     >> Thanks.
>     >>
>     >> --
>     >> --sazli
>     >> Syahrul Sazli Shaharir <sazli@pulasan.my>
>     >>
>
>
>
>     --
>     --sazli
>     Syahrul Sazli Shaharir <sazli@pulasan.my>
>     Mobile: +6019 385 8301 - YM/Skype: syahrulsazli
>     System Administrator
>     TMK Pulasan (002339810-M) http://pulasan.my/
>     11 Jalan 3/4, 43650 Bandar Baru Bangi, Selangor, Malaysia.
>     Tel/Fax: +603 8926 0338
>
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> @shapeblue
>
>
>



-- 
--sazli
