cloudstack-dev mailing list archives

From Linas Žilinskas <li...@host1plus.com>
Subject Re: patchviasocket seems to be broken with qemu 2.3(+?)
Date Tue, 20 Dec 2016 08:23:42 GMT
I don't think the issue is the same. As I mentioned in the original
report and my findings afterwards, this is specifically a qemu issue,
which was fixed in 2.4.0-rc3.

The issue was the way qemu exposes the socket used to communicate with
the VM. It didn't queue data, so unless the VM was listening on
/dev/vport.. at the time the data was sent, it would never receive it.
2.4.0-rc3 fixed this by queueing the sent data, so once sent, it is
available (only once) when the VM checks /dev/vport..
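
To illustrate the difference, here is a toy Python model. It is only a
sketch of the behaviour described above, not qemu's actual code; the
class and function names (UnqueuedPort, QueuedPort,
deliver_before_guest_is_ready) are made up for the example.

```python
from collections import deque


class UnqueuedPort:
    """Toy model of the old behaviour: writes are lost unless the guest is listening."""

    def __init__(self):
        self.guest_listening = False
        self._pending = deque()

    def guest_open(self):
        # The guest starts listening on its end of the channel.
        self.guest_listening = True

    def host_send(self, data):
        # The host-side write "succeeds" either way, but the data only
        # survives if the guest is already listening.
        if self.guest_listening:
            self._pending.append(data)

    def guest_read(self):
        # Each chunk is delivered at most once; empty bytes means nothing is waiting.
        return self._pending.popleft() if self._pending else b""


class QueuedPort(UnqueuedPort):
    """Toy model of the fixed behaviour: writes are kept until the guest reads them."""

    def host_send(self, data):
        self._pending.append(data)


def deliver_before_guest_is_ready(port, data=b"boot args"):
    # The host sends first; the guest only starts listening afterwards.
    port.host_send(data)
    port.guest_open()
    return port.guest_read()
```

With the unqueued model the host-side send reports no error, yet the
guest reads nothing if it opens the port late. With the queued model the
same sequence delivers the data exactly once.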


On 19/12/16 10:37, Wei ZHOU wrote:
> Hi Linas,
>
> It seems the issue you mentioned has been fixed by the commits for 
> https://issues.apache.org/jira/browse/CLOUDSTACK-2823
>
> CloudStack-agent will try to pass the boot args 30 times if the
> console IP is not accessible.
>
> Weird.
>
> -Wei
>
> 2016-12-19 10:03 GMT+01:00 Linas Žilinskas <linas@host1plus.com>:
>
>     From the logs it doesn't seem that the script times out. "Execution
>     is successful", so it manages to pass the data over the socket.
>
>     I guess the systemvm just doesn't configure itself for some reason.
>
>     Also, in my personal tests, I noticed some different behaviour
>     with different kernels. Don't remember the specifics right now,
>     but on some combinations (qemu / kernel) the socket acted
>     differently. For example the data was sent over the socket, but
>     wasn't visible inside the VM. Other times the socket would be
>     stuck from the host side.
>
>     So I would suggest testing different kernels (3.x, 4.4.x, 4.8.x),
>     or logging in to the system VM to see what's happening from inside.
>
>
>     On 12/16/16 03:46, Syahrul Sazli Shaharir wrote:
>>     On 2016-12-16 11:27, Syahrul Sazli Shaharir wrote:
>>>     On Wed, 26 Oct 2016, Linas Žilinskas wrote:
>>>
>>>>     So after some investigation I've found out that qemu 2.3.0 is
>>>>     indeed broken, at least for the way CS uses the qemu
>>>>     chardev/socket.
>>>>
>>>>     Not sure in which specific version the breakage appeared, but it
>>>>     was fixed in 2.4.0-rc3, and the fix specifically notes that
>>>>     CloudStack 4.2 was not working.
>>>>
>>>>     qemu git commit: 4bf1cb03fbc43b0055af60d4ff093d6894aa4338
>>>>
>>>>     Also attaching the patch from that commit.
>>>>
>>>>
>>>>     For our own purposes I've applied the patch to the qemu-kvm-ev
>>>>     package (2.3.0) and all is well.
>>>
>>>     Hi,
>>>
>>>     I am facing the exact same issue on latest Cloudstack 4.9.0.1, on
>>>     latest CentOS 7.3.1611, with latest qemu-kvm-ev-2.6.0-27.1.el7
>>>     package.
>>>
>>>     The issue initially surfaced following a heartbeat-induced reset of
>>>     all hosts, when the setup was on CS 4.8 @ CentOS 7.0 and stock
>>>     qemu-kvm-1.5.3. Since then, the patchviasocket.pl/py timeouts have
>>>     persisted for 1 out of 4 router VMs/networks, even after upgrading
>>>     to the latest code. (I have checked the qemu-kvm-ev-2.6.0-27.1.el7
>>>     source, and the patched code is pretty much still intact, as per
>>>     the 2.4.0-rc3 commit.)
>>>
>>>     Any help would be greatly appreciated.
>>>
>>>     Thanks.
>>>
>>>     (Attached are some debug logs from the host's agent.log)
>>
>>     Here are the debug logs as mentioned: http://pastebin.com/yHdsMNzZ
>>
>>     Thanks.
>>
>>>
>>>     --sazli
>>>
>>>>
>>>>
>>>>     On 2016-10-20 09:59, Linas Žilinskas wrote:
>>>>>
>>>>>      Hi.
>>>>>
>>>>>      We have upgraded to 4.9.
>>>>>
>>>>>      We run custom-built packages with our own patches, which to my
>>>>>      mind (I'm the only one patching them) should not affect the
>>>>>      issue I'll describe.
>>>>>
>>>>>      I'm not sure whether we didn't notice it before, or whether
>>>>>      it's actually related to something in 4.9.
>>>>>
>>>>>      Basically, our system VMs could not be patched via the qemu
>>>>>      socket. The script simply errored out with a timeout while
>>>>>      trying to push the data to the socket.
>>>>>
>>>>>      Executing it manually (with the command line from the logs)
>>>>>      gave the same result. I even tried the old perl variant, which
>>>>>      also had the same result.
>>>>>
>>>>>      So finally we found out that this issue happens only on our HVs
>>>>>      which run qemu 2.3.0, from the CentOS 7 special interest
>>>>>      virtualization repo. The other ones, which run qemu 1.5 from
>>>>>      the official repos, can patch the system VMs fine.
>>>>>
>>>>>      So I'm wondering whether anyone has tested 4.9 on KVM with
>>>>>      qemu >= 2.x? Maybe it is something else specific to our setup;
>>>>>      e.g. we run the HVs from a preconfigured netboot image (PXE),
>>>>>      but that applies to all of them, including those with qemu 1.5,
>>>>>      so I have no idea.
>>>>>
>>>>>
>>>>>      Linas Žilinskas
>>>>>      Head of Development
>>>>>      website <http://www.host1plus.com/> facebook
>>>>>      <https://www.facebook.com/Host1Plus> twitter
>>>>>      <https://twitter.com/Host1Plus> linkedin
>>>>>      <https://www.linkedin.com/company/digital-energy-technologies-ltd.>
>>>>>
>>>>>
>>>>>      Host1Plus is a division of Digital Energy Technologies Ltd.
>>>>>
>>>>>      26 York Street, London W1U 6PZ, United Kingdom
>>>>>
>>
>

Linas Žilinskas
Head of Development
website <http://www.host1plus.com/> facebook 
<https://www.facebook.com/Host1Plus> twitter 
<https://twitter.com/Host1Plus> linkedin 
<https://www.linkedin.com/company/digital-energy-technologies-ltd.>

Host1Plus is a division of Digital Energy Technologies Ltd.

26 York Street, London W1U 6PZ, United Kingdom

