cloudstack-dev mailing list archives

From Wei ZHOU <ustcweiz...@gmail.com>
Subject Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
Date Mon, 13 Nov 2017 09:18:12 GMT
Hi Ivan,

I would suggest that you create a Jira ticket for each problem you found in
your testing, and a GitHub pull request for each Jira ticket.
That makes it easier for reviewers.

Kind regards,
Wei

2017-11-13 10:01 GMT+01:00 Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>:

> Hello, Ozhan
>
> https://github.com/apache/cloudstack/pull/2320
>
> fixes everything I have found so far. With it, everything works
> correctly even if no IPv6 CIDR is specified for the network (at least on
> Ubuntu 14.04).
> For IPv6 configuration instructions, please take a look at:
> https://github.com/apache/cloudstack/commit/f10c8bfe0c99a762c2606459413a47219614e775
> (oh my god, I spent several hours trying to find out how to configure IPv6
> for 4.10).
>
> Please don't forget to recreate the SSVM, because there is a fix for
> templates too:
> https://github.com/apache/cloudstack/pull/2322
>
>
> 2017-11-13 15:51 GMT+07:00 Özhan Rüzgar Karaman <oruzgarkaraman@gmail.com>:
>
> > Hi Ivan,
> > Do these hotfixes also solve the quoting and shell interpretation problem?
> > We have no IPv6 setup, and today we ran a similar test with a fresh 4.10
> > install. We get a similar error at the security groups stage, even though
> > the br_netfilter module is already active in our environment. We ran the
> > same tests on Ubuntu 16.04.3 and 14.04.5 KVM hosts.
> >
> > Logs are below:
> > 2017-11-13 11:47:41,773 DEBUG [kvm.resource.LibvirtComputingResource]
> > (agentRequest-Handler-1:null) Executing:
> > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
> > null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
> > 1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
> > I:tcp:1:65535:
> > 0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT
> ;
> > 2017-11-13 11:47:41,773 WARN  [kvm.resource.LibvirtComputingResource]
> > (agentRequest-Handler-1:null) Exception:
> > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
> > null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
> > 1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
> > I:tcp:1:65535:
> > 0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT
> ;
> > java.lang.NullPointerException
> > at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
> > at com.cloud.utils.script.Script.execute(Script.java:214)
> > at com.cloud.utils.script.Script.execute(Script.java:182)
> > at
> > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.
> > addNetworkRules(LibvirtComputingResource.java:3429)
> > at
> > com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesComma
> > ndWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:57)
> > at
> > com.cloud.hypervisor.kvm.resource.wrapper.LibvirtSecurityGroupRulesComma
> > ndWrapper.execute(LibvirtSecurityGroupRulesCommandWrapper.java:36)
> > at
> > com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(
> > LibvirtRequestWrapper.java:75)
> > at
> > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.
> executeRequest(
> > LibvirtComputingResource.java:1369)
> > at com.cloud.agent.Agent.processRequest(Agent.java:525)
> > at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
> > at com.cloud.utils.nio.Task.call(Task.java:83)
> > at com.cloud.utils.nio.Task.call(Task.java:29)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(
> > ThreadPoolExecutor.java:1149)
> > at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > ThreadPoolExecutor.java:624)
> > at java.lang.Thread.run(Thread.java:748)
> > 2017-11-13 11:47:41,774 WARN
> > [resource.wrapper.LibvirtSecurityGroupRulesCommandWrapper]
> > (agentRequest-Handler-1:null) Failed to program network rules for vm
> > i-2-5-VM
> > 2017-11-13 11:47:41,775 DEBUG [cloud.agent.Agent]
> > (agentRequest-Handler-1:null) Seq 1-6412562919422165093:  { Ans: ,
> MgmtId:
> > 345048635880, via: 1, Ver: v1, Flags: 110,
> > [{"com.cloud.agent.api.SecurityGroupRuleAnswer":{"
> > logSequenceNumber":16,"vmId":5,"reason":"PROGRAMMING_
> > FAILED","result":false,"details":"programming
> > network rules failed","wait":0}}] }
> >
> >
> > When we run the command from the command line with double quotes around
> > the rules section, it executes without a problem, as shown below:
> > root@kvmt3:/var/log/cloudstack/agent#
> > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > add_network_rules --vmname i-2-5-VM --vmid 5 --vmip 192.168.18.6 --vmip6
> > null --sig 74a6d8c403af9c3c7b89ecf206e4ac26 --seq 16 --vmmac
> > 1e:00:9b:00:00:05 --vif vnet8 --brname breth0-23 --nicsecips 0: --rules
> > "I:tcp:1:65535:
> > 0.0.0.0/0,NEXT;I:udp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT
> ;"
> > root@kvmt3:/var/log/cloudstack/agent# echo $?
> > 0
> > root@kvmt3:/var/log/cloudstack/agent#
> >
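A note on the trace above: java.lang.ProcessBuilder.start() throws NullPointerException when any element of its command list is null, and it launches the program directly rather than through a shell. One possibility worth ruling out, and it is only an assumption here, is that the value logged as "--vmip6 null" was a real null reference rather than the string "null" that the manual run passes. A minimal Python sketch of guarding against that when assembling the arguments; the helper name build_sg_args and the choice to drop unset options are hypothetical, not the agent's behaviour:

```python
def build_sg_args(script, action, options):
    """Build an argv list, skipping options whose value is None.

    `options` is a list of (flag, value) pairs. The helper name and the
    decision to drop unset options are assumptions for illustration,
    not the behaviour of the CloudStack agent.
    """
    argv = [script, action]
    for flag, value in options:
        if value is None:
            # A None (null) element cannot be handed to exec-style APIs.
            continue
        argv.extend([flag, str(value)])
    return argv


argv = build_sg_args(
    "security_group.py",
    "add_network_rules",
    [
        ("--vmname", "i-2-5-VM"),
        ("--vmip", "192.168.18.6"),
        ("--vmip6", None),  # no IPv6 address configured
        ("--rules", "I:tcp:1:65535:0.0.0.0/0,NEXT;"),
    ],
)
print(argv)
```

Whether security_group.py accepts a call without --vmip6 is not established in this thread, so treat the omission as one option among several (passing an explicit placeholder string is another).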
> > Thanks
> > Özhan
> >
> >
> > On Sat, Nov 11, 2017 at 6:59 PM, Ivan Kudryavtsev <
> > kudryavtsev_ia@bw-sw.com>
> > wrote:
> >
> > > Hello, I implemented some hotfixes to make 4.10 work:
> > >
> > > https://github.com/apache/cloudstack/pull/2319 - to master (loads the
> > > br_netfilter module)
> > > https://github.com/apache/cloudstack/pull/2320 - to 4.10, fixes SG
> > > failures related to IPv6.
> > >
> > >
> > > 2017-11-11 15:51 GMT+07:00 Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>:
> > >
> > > > Following up on my previous question: I managed to make it work by
> > > > removing everything and moving to an Ubuntu 14.04 hypervisor host.
> > > >
> > > > Here is what else I found:
> > > >
> > > > 1. When setting up the databases (management server) with a custom port
> > > > specified, the databases themselves are not created. If they are created
> > > > manually, the import scripts work fine.
> > > > 2. UI: unable to download an ISO to __all__ zones. A specific zone has to
> > > > be selected, otherwise the UI gives an error.
> > > > 3. Ubuntu doesn't load the *br_netfilter* module, but
> > > >
> > > > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > > >
> > > > relies on it, and the result is nothing good:
> > > >
> > > > 2017-11-11 15:38:29,241 - sysctl -w net.bridge.bridge-nf-call-arptables=1
> > > > 2017-11-11 15:38:29,244 - sysctl -w net.bridge.bridge-nf-call-iptables=1
> > > > 2017-11-11 15:38:29,247 - sysctl -w net.bridge.bridge-nf-call-ip6tables=1
> > > >
> > > > Adding br_netfilter to /etc/modules fixes it. I suppose the script needs
> > > > something like: modprobe br_netfilter (or something smarter).
> > > >
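The "something smarter" above could look roughly like this. It is a minimal Python sketch, not the change made in any of the pull requests in this thread: it assumes the bridge-nf-call-* keys appear under /proc/sys/net/bridge once the needed module is loaded, and it only calls modprobe when the key is missing. The function name ensure_br_netfilter and its parameters are hypothetical.

```python
import os
import subprocess

# Assumption: the bridge netfilter sysctl keys live in this directory.
BRIDGE_SYSCTL_DIR = "/proc/sys/net/bridge"


def ensure_br_netfilter(sysctl_dir=BRIDGE_SYSCTL_DIR, run=subprocess.check_call):
    """Load br_netfilter only when the bridge sysctl key is missing.

    Returns True if modprobe was invoked, False if nothing had to be done.
    `run` is injectable so the logic can be exercised without root.
    """
    key = os.path.join(sysctl_dir, "bridge-nf-call-iptables")
    if os.path.exists(key):
        return False
    run(["modprobe", "br_netfilter"])
    return True
```

Calling ensure_br_netfilter() before the sysctl -w lines would need root, and subprocess.check_call raises CalledProcessError if modprobe fails, so the failure is visible instead of being silently ignored.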
> > > > But it still doesn't work completely; security groups are actually
> > > > non-functional:
> > > >
> > > > ==> /var/log/cloudstack/agent/agent.log <==
> > > > 2017-11-11 15:40:41,326 WARN  [kvm.resource.
> LibvirtComputingResource]
> > > > (agentRequest-Handler-2:null) (logid:eab9a328) Exception:
> > > > /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> > > > add_network_rules --vmname i-2-7-VM --vmid 7 --vmip 176.120.28.4
> > --vmip6
> > > > null --sig d60255deb618b7be9f477eed10d65234 --seq 4 --vmmac
> > > > 1e:00:6f:00:01:01 --vif vnet8 --brname cloudbr0 --nicsecips 0:
> --rules
> > > > I:icmp:-1:-1:0.0.0.0/0,NEXT;I:tcp:1:65535:0.0.0.0/0,NEXT;I:
> > > > udp:1:65535:0.0.0.0/0,NEXT;E:icmp:-1:-1:0.0.0.0/0,NEXT;E:
> > > > tcp:1:65535:0.0.0.0/0,NEXT;E:udp:1:65535:0.0.0.0/0,NEXT;
> > > > java.lang.NullPointerException
> > > > at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
> > > > at com.cloud.utils.script.Script.execute(Script.java:214)
> > > > at com.cloud.utils.script.Script.execute(Script.java:182)
> > > > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.
> > > > addNetworkRules(LibvirtComputingResource.java:3429)
> > > > at com.cloud.hypervisor.kvm.resource.wrapper.
> > > > LibvirtSecurityGroupRulesCommandWrapper.execute(
> > > > LibvirtSecurityGroupRulesCommandWrapper.java:57)
> > > > at com.cloud.hypervisor.kvm.resource.wrapper.
> > > > LibvirtSecurityGroupRulesCommandWrapper.execute(
> > > > LibvirtSecurityGroupRulesCommandWrapper.java:36)
> > > > at com.cloud.hypervisor.kvm.resource.wrapper.
> > > > LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:75)
> > > > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.
> > > > executeRequest(LibvirtComputingResource.java:1369)
> > > > at com.cloud.agent.Agent.processRequest(Agent.java:525)
> > > > at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
> > > > at com.cloud.utils.nio.Task.call(Task.java:83)
> > > > at com.cloud.utils.nio.Task.call(Task.java:29)
> > > > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1149)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:624)
> > > > at java.lang.Thread.run(Thread.java:748)
> > > > 2017-11-11 15:40:41,327 WARN  [resource.wrapper.
> > > > LibvirtSecurityGroupRulesCommandWrapper]
> (agentRequest-Handler-2:null)
> > > > (logid:eab9a328) Failed to program network rules for vm i-2-7-VM
> > > >
> > > > So, no rules are actually created. The script isn't called... I suppose
> > > > quotes may be required, because the shell interprets ';' as a command
> > > > separator. I suppose this comes from an optimization introduced in 4.10,
> > > > because in 4.9 SGs work like a charm...
> > > >
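The quoting concern above can be checked in isolation. A minimal Python sketch, assuming the command were assembled as a single string for a POSIX shell (whether the 4.10 agent actually does that is not established in this thread): shlex.quote wraps the rules value in single quotes so that ';' is not treated as a command separator, and shlex.split shows the value arriving as one argument.

```python
import shlex

# Rules string in the same format as the log lines above.
rules = "I:tcp:1:65535:0.0.0.0/0,NEXT;E:tcp:1:65535:0.0.0.0/0,NEXT;"

# Quote the value so a POSIX shell passes it through as a single argument.
quoted = shlex.quote(rules)
command_line = "security_group.py add_network_rules --rules " + quoted

# Parse the line the way a POSIX shell would tokenize it.
parsed = shlex.split(command_line)

print(command_line)
print(parsed[-1] == rules)
```

When the program is launched from an argument list without a shell, no quoting is needed at all, because each list element is delivered as exactly one argument.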
> > > >
> > > > 2017-11-11 3:15 GMT+07:00 Paul Angus <paul.angus@shapeblue.com>:
> > > >
> > > >> Ivan,
> > > >>
> > > >> Can you paste a larger section of unfiltered logs? There should always
> > > >> be a message explaining why the mgmt. server thought that a VR should
> > > >> be shut down.
> > > >>
> > > >>
> > > >>
> > > >> Kind regards,
> > > >>
> > > >> Paul Angus
> > > >>
> > > >> paul.angus@shapeblue.com
> > > >> www.shapeblue.com
> > > >> 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> > > >> @shapeblue
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> -----Original Message-----
> > > >> From: Simon Weller [mailto:sweller@ena.com.INVALID]
> > > >> Sent: 10 November 2017 18:39
> > > >> To: dev@cloudstack.apache.org
> > > >> Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> > > >>
> > > >> What VR template image are you using?
> > > >>
> > > >>
> > > >> ________________________________
> > > >> From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> > > >> Sent: Friday, November 10, 2017 11:59 AM
> > > >> To: dev@cloudstack.apache.org
> > > >> Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> > > >>
> > > >> Hi. No, regular NFS. The VR starts fine but is stopped by the MS; the
> > > >> other system VMs are working. I even added "sleep 3600" before ssh in
> > > >> the communication script on the compute node, so the response to
> > > >> management is delayed. I logged in to the VR: all interfaces are up and
> > > >> the iptables rules are OK.
> > > >>
> > > >> So the agent brings the VR up fine, but stops it on the management
> > > >> server's order for no obvious reason.
> > > >>
> > > >> On 11 Nov 2017 at 0:54, "Simon Weller" <sweller@ena.com.invalid> wrote:
> > > >>
> > > >> > Is the storage ceph?
> > > >> >
> > > >> >
> > > >> > ________________________________
> > > >> > From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> > > >> > Sent: Friday, November 10, 2017 11:52 AM
> > > >> > To: dev@cloudstack.apache.org
> > > >> > Subject: Re: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> > > >> >
> > > >> > Hi, I did, and it does the right thing. I even added "tee" to the ssh
> > > >> > 3922 communication script to write the VR response to an additional
> > > >> > log. It only receives the VR version line, sends all the info (the
> > > >> > same as in the pastebin) to ACS, and receives a "stop" order.
> > > >> >
> > > >> > I'll try to provide additional info, but as you can see, management
> > > >> > receives a proper response and sends stop as the next operation. It
> > > >> > looks very strange without any notification...
> > > >> >
> > > >> > On 11 Nov 2017 at 0:37, "Simon Weller" <sweller@ena.com.invalid> wrote:
> > > >> >
> > > >> > > Ivan,
> > > >> > >
> > > >> > >
> > > >> > > Can you put the host agents into debug mode? Hopefully that will
> > > >> > > provide more information.
> > > >> > >
> > > >> > >
> > > >> > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/KVM+agent+debug
> > > >>
> > > >>
> > > >>
> > > >> > >
> > > >> > >
> > > >> > > - Si
> > > >> > >
> > > >> > > ________________________________
> > > >> > > From: Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> > > >> > > Sent: Friday, November 10, 2017 11:34 AM
> > > >> > > To: dev@cloudstack.apache.org
> > > >> > > Subject: Apache CloudStack 4.10 VR/BasicZone/KVM Problem
> > > >> > >
> > > >> > > Hello, Devs.
> > > >> > >
> > > >> > > I am experiencing a VR start problem in a fresh ACS 4.10 deployment.
> > > >> > >
> > > >> > > The interesting part of the logs is here: https://pastebin.com/iBXRBA5N
> > > >>
> > > >>
> > > >>
> > > >> > >
> > > >> > > Basically, the situation looks like this:
> > > >> > >
> > > >> > > 1. The Management Server tries to launch the VR.
> > > >> > > 2. It gets a proper VR response with VR details from the Agent.
> > > >> > > 3. It sends StopCommand without explanation.
> > > >> > >
> > > >> > > I'm trying to figure out what happens inside, but the codebase is
> > > >> > > huge and I still have no positive results. Please let me know if you
> > > >> > > have any ideas that could help me find the reason. Thanks a lot.
> > > >> > >
> > > >> > > --
> > > >> > > With best regards, Ivan Kudryavtsev
> > > >> > > Bitworks Software, Ltd.
> > > >> > > Cell: +7-923-414-1515
> > > >> > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > With best regards, Ivan Kudryavtsev
> > > > Bitworks Software, Ltd.
> > > > Cell: +7-923-414-1515
> > > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > > >
> > > >
> > >
> > >
> > > --
> > > With best regards, Ivan Kudryavtsev
> > > Bitworks Software, Ltd.
> > > Cell: +7-923-414-1515
> > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > >
> >
>
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ <http://bw-sw.com/>
>
