cloudstack-dev mailing list archives

From Özhan Rüzgar Karaman <oruzgarkara...@gmail.com>
Subject Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
Date Mon, 13 Nov 2017 08:51:58 GMT
Hi Ivan,
Do these hotfixes also solve the problem with quotes and shell
interpretation of the script arguments?
We have no IPv6 setup, and today we ran a similar test with a fresh install
of 4.10. We noticed that we get a similar error at the security groups stage
even though the br_netfilter module is already loaded in our environment. We
ran the same tests on Ubuntu 16.04.3 and 14.04.5 KVM hosts.
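
A minimal sketch of one way to confirm that br_netfilter is active on a host: it looks for the module name in /proc/modules and for the bridge-nf-call-* sysctl entries (the ones security_group.py sets in the log quoted further down) under /proc/sys/net/bridge. This is not part of CloudStack; the function names and the `root` parameter are hypothetical and only there for illustration.

```python
import os

# Sysctl entries that appear in the "sysctl -w net.bridge.*" log lines
# quoted later in this thread.
BRIDGE_KEYS = (
    "bridge-nf-call-arptables",
    "bridge-nf-call-iptables",
    "bridge-nf-call-ip6tables",
)


def module_listed(name, modules_text):
    """Return True if `name` is the first field of any /proc/modules line."""
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def missing_bridge_keys(root="/"):
    """Return the bridge-nf-call-* sysctl files that do not exist under root."""
    base = os.path.join(root, "proc", "sys", "net", "bridge")
    return [k for k in BRIDGE_KEYS if not os.path.exists(os.path.join(base, k))]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as fh:
            listed = module_listed("br_netfilter", fh.read())
    except OSError:
        listed = False
    print("br_netfilter listed in /proc/modules:", listed)
    print("missing sysctl entries:", missing_bridge_keys() or "none")
```

If the module is listed and no sysctl entries are missing, the br_netfilter precondition is probably satisfied and the failure has to come from somewhere else.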

Logs are below:
2017-11-13 11:47:41,773 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-1:null) Executing:
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
I:tcp:1:65535:
0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT;
2017-11-13 11:47:41,773 WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-1:null) Exception:
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
I:tcp:1:65535:
0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT;
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at com.cloud.utils.script.Script.execute(Script.java:214)
at com.cloud.utils.script.Script.execute(Script.java:182)
at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.addNetworkRules(LibvirtComputingResource.java:3429)
at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:57)
at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:36)
at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:75)
at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1369)
at com.cloud.agent.Agent.processRequest(Agent.java:525)
at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
at com.cloud.utils.nio.Task.call(Task.java:83)
at com.cloud.utils.nio.Task.call(Task.java:29)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2017-11-13 11:47:41,774 WARN
[resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper]
(agentRequest-Handler-1:null) Failed to program network rules for vm
i-2-5-VM
2017-11-13 11:47:41,775 DEBUG [cloud.agent.Agent]
(agentRequest-Handler-1:null) Seq 1-6412562919422165093:  { Ans: , MgmtId:
345048635880, via: 1, Ver: v1, Flags: 110,
[{"com.cloud.agent.api.SecurityGroupRuleAnswer":{"logSequenceNumber":16,"vmId":5,"reason":"PROGRAMMING_FAILED","result":false,"details":"programming
network rules failed","wait":0}}] }


When we execute the command from the command line with double quotes around
the rules argument, it runs without a problem, as shown below:
root@kvmt3:/var/log/cloudstack/agent#
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
"I:tcp:1:65535:
0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT;"
root@kvmt3:/var/log/cloudstack/agent# echo $?
0
root@kvmt3:/var/log/cloudstack/agent#
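
To illustrate why the quotes matter on an interactive command line, here is a small sketch. It approximates how a POSIX shell would split the line (using Python's `shlex`, which is only an approximation of a real shell) and compares that with passing the arguments as a list with no shell involved. The names `shell_view`, `argv_without_shell` and the shortened `RULES` value are hypothetical, and the `shlex` settings assume Python 3.8 or later. Whether the agent's own invocation goes through a shell at all is an assumption to verify; the stack trace shows the script being started through java.lang.ProcessBuilder, which takes an argument list.

```python
import shlex
import subprocess
import sys

# Shortened version of the --rules value from the log above.
RULES = "I:tcp:1:65535:0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;"

# Tiny child program that prints every argument it receives on its own line.
ECHO_ARGS = "import sys; print('\\n'.join(sys.argv[1:]))"


def shell_view(command_line):
    """Approximate how a POSIX shell splits a line, keeping ';' as a token."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def argv_without_shell(rules):
    """Run a child with an argument list (no shell) and return what it saw."""
    done = subprocess.run(
        [sys.executable, "-c", ECHO_ARGS, "--rules", rules],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return done.stdout.splitlines()


if __name__ == "__main__":
    # Unquoted: ';' shows up as a separate token, so the line is split.
    print(shell_view("script --rules " + RULES))
    # Quoted: the whole rules string stays a single argument.
    print(shell_view('script --rules "' + RULES + '"'))
    # Argument list, no shell: ';' is passed through untouched.
    print(argv_without_shell(RULES))
```

In the unquoted case the rules string is cut at every ';', which matches what we see when typing the command by hand. In the argument-list case the ';' characters reach the child untouched, so if the agent starts the script that way, missing quotes alone would not explain the failure.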

Thanks
Özhan


On Sat, Nov 11, 2017 at 6:59 PM, Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
wrote:

> Hello, I implemented some hotfixes to get 4.10 working:
>
> https://github.com/apache/cloudstack/pull/2319 - to master (load
> br_netfilter module)
> https://github.com/apache/cloudstack/pull/2320 - to 4.10 which fixes SG
> failures related to ipv6.
>
>
> 2017-11-11 15:51 GMT+07:00 Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>:
>
> > Following up on my previous question: I managed to make it work by
> > removing everything and moving to an Ubuntu 14.04 hypervisor host.
> >
> > Here is what else I found:
> >
> > 1. When setting up the databases (management server) with a custom port
> > specified, the databases themselves are not created. If they are created
> > manually, the import scripts work fine.
> > 2. UI: unable to download an ISO to __all__ zones. A specific zone has to
> > be selected, otherwise the UI gives an error.
> > 3. Ubuntu doesn't load the *br_netfilter* module, but
> >
> > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> >
> > uses it, and the result is not good:
> >
> > 2017-11-11 15:38:29,241 - sysctl -w net.bridge.bridge-nf-call-arptables=1
> > 2017-11-11 15:38:29,244 - sysctl -w net.bridge.bridge-nf-call-iptables=1
> > 2017-11-11 15:38:29,247 - sysctl -w net.bridge.bridge-nf-call-ip6tables=1
> >
> > Adding br_netfilter to /etc/modules fixes it. I suppose the script needs
> > something like: modprobe br_netfilter (or something smarter).
> >
> > But it still doesn't work completely; security groups are actually
> > non-functional:
> >
> > ==> /var/log/cloudstack/agent/agent.log <==
> > 2017-11-11 15:40:41,326 WARN  [kvm.resource.LibvirtComputingResource]
> > (agentRequest-Handler-2:null) (logid:eab9a328) Exception:
> > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > add_network_rules --vmname i-2-7-VM --vmid 7 --vmip 176.120.28.4 --vmip6
> > null --sig d60255deb618b7be9f477eed10d65234 --seq 4 --vmmac
> > 1e:00:6f:00:01:01 --vif vnet8 --brname cloudbr0 --nicsecips 0: --rules
> > I:icmp:-1:-1:0.0.0.0/0,NEXT;I:tcp:1:65535:0.0.0.0/0,NEXT;I:
> > udp:1:65535:0.0.0.0/0,NEXT;E:icmp:-1:-1:0.0.0.0/0,NEXT;E:
> > tcp:1:65535:0.0.0.0/0,NEXT;E:udp:1:65535:0.0.0.0/0,NEXT;
> > java.lang.NullPointerException
> > at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
> > at com.cloud.utils.script.Script.execute(Script.java:214)
> > at com.cloud.utils.script.Script.execute(Script.java:182)
> > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.addNetworkRules(LibvirtComputingResource.java:3429)
> > at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:57)
> > at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:36)
> > at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:75)
> > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1369)
> > at com.cloud.agent.Agent.processRequest(Agent.java:525)
> > at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
> > at com.cloud.utils.nio.Task.call(Task.java:83)
> > at com.cloud.utils.nio.Task.call(Task.java:29)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > at java.lang.Thread.run(Thread.java:748)
> > 2017-11-11 15:40:41,327 WARN  [resource.wrapper.
> > LibvirtSecurityGroupRulesCommandWrapper] (agentRequest-Handler-2:null)
> > (logid:eab9a328) Failed to program network rules for vm i-2-7-VM
> >
> > So no rules are actually created. The script isn't called... I suppose
> > quotes may be required because the shell interprets ';' as a command
> > separator. I suppose some optimization was introduced in 4.10, because in
> > 4.9 SGs work like a charm...
> >
> >
> > 2017-11-11 3:15 GMT+07:00 Paul Angus <paul.angus@shapeblue.com>:
> >
> >> Ivan,
> >>
> >> Can you paste a larger section of unfiltered logs? There would always be
> >> a message explaining why the mgmt. server thought that a VR should be
> >> shut down.
> >>
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >> paul.angus@shapeblue.com
> >> www.shapeblue.com
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Simon Weller [mailto:sweller@ena.com.INVALID]
> >> Sent: 10 November 2017 18:39
> >> To: dev@cloudstack.apache.org
> >> Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> >>
> >> What VR template image are you using?
> >>
> >>
> >> ________________________________
> >> From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> >> Sent: Friday, November 10, 2017 11:59 AM
> >> To: dev@cloudstack.apache.org
> >> Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> >>
> >> Hi. No, regular NFS. The VR starts fine but is then stopped by the
> >> management server; the other system VMs are working. I even added "sleep
> >> 3600" before ssh in the communication script on the compute node, so the
> >> response to management is delayed. I logged in to the VR: all interfaces
> >> are up and the iptables rules are OK.
> >>
> >> So the agent brings the VR up fine, but then stops it on the management
> >> server's order for no obvious reason.
> >>
> >> On 11 Nov 2017 at 0:54, "Simon Weller" <sweller@ena.com.invalid> wrote:
> >>
> >> > Is the storage ceph?
> >> >
> >> >
> >> > ________________________________
> >> > From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> >> > Sent: Friday, November 10, 2017 11:52 AM
> >> > To: dev@cloudstack.apache.org
> >> > Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> >> >
> >> > Hi, I did, and it does things right. I even added "tee" to the ssh
> >> > 3922 communication script to write the VR response to an additional
> >> > log; it only receives the VR version line, sends all the info (the same
> >> > as in the pastebin) to ACS, and receives a "stop" order.
> >> >
> >> > I'll try to provide additional info, but as you can see, management
> >> > receives a proper response and sends stop as the next operation. It
> >> > looks very strange without any notification...
> >> >
> >> > On 11 Nov 2017 at 0:37, "Simon Weller" <sweller@ena.com.invalid> wrote:
> >> >
> >> > > Ivan,
> >> > >
> >> > >
> >> > > Can you put the host agents into debug mode? Hopefully that will
> >> > > provide more information.
> >> > >
> >> > >
> >> > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/KVM+agent+debug
> >> > >
> >> > >
> >> > > - Si
> >> > >
> >> > > ________________________________
> >> > > From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> >> > > Sent: Friday, November 10, 2017 11:34 AM
> >> > > To: dev@cloudstack.apache.org
> >> > > Subject: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> >> > >
> >> > > Hello, Devs.
> >> > >
> >> > > I am experiencing a VR start problem in a fresh ACS 4.10 deployment.
> >> > >
> >> > > The interesting part of the logs is here: https://pastebin.com/iBXRBA5N
> >> > >
> >> > > Basically, the situation looks like this:
> >> > >
> >> > > 1. The Management Server tries to launch the VR.
> >> > > 2. It gets a proper VR response with VR details from the Agent.
> >> > > 3. It sends StopCommand without explanation.
> >> > >
> >> > > I'm trying to figure out what happens inside, but the codebase is huge
> >> > > and there are still no positive results. Please let me know if you have
> >> > > any ideas that could help me find the reason. Thanks a lot.
> >> > >
> >> > > --
> >> > > With best regards, Ivan Kudryavtsev
> >> > > Bitworks Software, Ltd.
> >> > > Cell: +7-923-414-1515
> >> > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> >> > >
> >> >
> >>
> >>
> >
> >
> > --
> > With best regards, Ivan Kudryavtsev
> > Bitworks Software, Ltd.
> > Cell: +7-923-414-1515
> > WWW: http://bitworks.software/ <http://bw-sw.com/>
> >
> >
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ <http://bw-sw.com/>
>
