cloudstack-users mailing list archives

From benoit lair <kurushi4...@gmail.com>
Subject Re: Production recovery procedure for VPC Virtual router into advanced zone - CS 4.0.0
Date Thu, 14 Feb 2013 09:45:06 GMT
Hello Ahmad,

I tried restarting the VPC (through the UI dashboard), and my VPC VR was
recreated. In management-server.log I can see the request that recreates
my VPC:

2013-02-14 10:17:01,008 DEBUG [agent.transport.Request]
(DirectAgent-289:null) Seq 14-1794712962: Executing:  { Cmd , MgmtId:
165013918770818, via: 14, Ver: v1, Flags: 100111,
[{"StartCommand":{"vm":{"id":18183,"name":"r-18183-VM","bootloader":"PyGrub","type":"DomainRouter","cpus":1,"speed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian
GNU/Linux 6(64-bit)","bootArgs":" vpccidr=172.28.0.0/16 domain=
z03.cloud.web-et-solutions.com dns1=195.7.111.30 dns2=91.151.119.67
template=domP name=r-18183-VM eth0ip=169.254.3.22 eth0mask=255.255.0.0
type=vpcrouter
disable_rp_filter=true","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"vncPassword":"9c5468337802250c","params":{},"uuid":"02f8704c-77dc-487c-b070-408c6e6c1060","disks":[{"id":8069,"name":"ROOT-18183","mountPoint":"/iqn.1984-05.com.dell:powervault.md3200i.6d4ae52000b3a400000000005019e097/6","path":"294080ab-1f18-48e7-9941-d8b449e2fecb","size":2147483648,"type":"ROOT","storagePoolType":"IscsiLUN","storagePoolUuid":"49ddaea6-0f2c-327a-844f-fd3f64f6963c","deviceId":0}],"nics":[{"deviceId":0,"networkRateMbps":-1,"defaultNic":false,"uuid":"c558916e-95f7-415d-9533-88a84b35d8ab","ip":"169.254.3.22","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:16","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"wait":0}},{"check.CheckSshCommand":{"ip":"169.254.3.22","port":3922,"interval":6,"retries":100,"name":"r-18183-VM","wait":0}},{"GetDomRVersionCmd":{"accessDetails":{"router.ip":"169.254.3.22","
router.name
":"r-18183-VM"},"wait":0}},{},{"PlugNicCommand":{"nic":{"deviceId":1,"networkRateMbps":200,"defaultNic":true,"uuid":"a4a6aed0-1d0d-457d-8df2-478b462354c4","ip":"10.14.10.36","netmask":"255.254.0.0","gateway":"10.14.0.1","mac":"06:b7:74:00:07:8f","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://2020","isolationUri":"vlan://2020","isSecurityGroupEnabled":false,"name":"ground1"},"instanceName":"r-18183-VM","wait":0}},{"routing.IpAssocVpcCommand":{"ipAddresses":[{"accountId":2,"publicIp":"10.14.10.36","sourceNat":true,"add":true,"oneToOneNat":false,"firstIP":false,"vlanId":"2020","vlanGateway":"10.14.0.1","vlanNetmask":"255.254.0.0","vifMacAddress":"06:b7:74:00:07:8f","networkRate":200,"trafficType":"Public","networkName":"ground1"}],"accessDetails":{"router.guest.ip":"10.14.10.36","zone.network.type":"Advanced","router.ip":"169.254.3.22","
router.name
":"r-18183-VM"},"wait":0}},{"routing.SetSourceNatCommand":{"ipAddress":{"accountId":2,"publicIp":"10.14.10.36","sourceNat":true,"add":true,"oneToOneNat":false,"firstIP":false,"vlanId":"2020","vlanGateway":"10.14.0.1","vlanNetmask":"255.254.0.0","vifMacAddress":"06:b7:74:00:07:8f","networkRate":200,"trafficType":"Public","networkName":"ground1"},"add":true,"accessDetails":{"zone.network.type":"Advanced","router.ip":"169.254.3.22","
router.name
":"r-18183-VM"},"wait":0}},{"PlugNicCommand":{"nic":{"deviceId":2,"networkRateMbps":200,"defaultNic":false,"uuid":"eb0254f5-d484-4d56-86cb-0a6b0c844e6d","ip":"172.28.2.1","netmask":"255.255.255.0","mac":"02:00:6e:c8:00:06","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2502","isolationUri":"vlan://2502","isSecurityGroupEnabled":false,"name":"ground1"},"instanceName":"r-18183-VM","wait":0}},{"SetupGuestNetworkCommand":{"dhcpRange":"172.28.2.1","networkDomain":"
z03.cloud.web-et-solutions.com
","defaultDns1":"195.7.111.30","defaultDns2":"91.151.119.67","isRedundant":false,"add":true,"nic":
{"deviceId":2,"networkRateMbps":200,"defaultNic":false,"uuid":"eb0254f5-d484-4d56-86cb-0a6b0c844e6d","ip":"172.28.2.1","netmask":"255.255.255.0","mac":"02:00:6e:c8:00:06","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2502","isolationUri":"vlan://2502","isSecurityGroupEnabled":false,"name":"ground1"},"accessDetails":{"router.guest.ip":"172.28.2.1","guest.vlan.tag":"2502","guest.network.gateway":"172.28.2.1","guest.bridge":"172.28.2.255","
router.name":"r-18183-VM","router.ip":"169.254.3.22"},"wait":0}},
{"PlugNicCommand":{"nic":{"deviceId":3,"networkRateMbps":200,"defaultNic":false,"uuid":"f1c50f06-db19-4293-9842-d37bb4d8bba2","ip":"172.28.1.1","netmask":"255.255.255.0","mac":"02:00:20:cc:00:08","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2510","isolationUri":"vlan://2510","isSecurityGroupEnabled":false,"name":"ground1"},"instanceName":"r-18183-VM","wait":0}},{"SetupGuestNetworkCommand":{"dhcpRange":"172.28.1.1","networkDomain":"
z03.cloud.web-et-solutions.com
","defaultDns1":"195.7.111.30","defaultDns2":"91.151.119.67","isRedundant":false,"add":true,"nic":{"deviceId":3,"networkRateMbps":200,"defaultNic":false,"uuid":"f1c50f06-db19-4293-9842-d37bb4d8bba2","ip":"172.28.1.1","netmask":"255.255.255.0","mac":"02:00:20:cc:00:08","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2510","isolationUri":"vlan://2510","isSecurityGroupEnabled":false,"name":"ground1"},"accessDetails":{"router.guest.ip":"172.28.1.1","guest.vlan.tag":"2510","guest.network.gateway":"172.28.1.1","guest.bridge":"172.28.1.255","
router.name
":"r-18183-VM","router.ip":"169.254.3.22"},"wait":0}},{"PlugNicCommand":{"nic":{"deviceId":4,"networkRateMbps":200,"defaultNic":false,"uuid":"f969ed36-9a5e-4683-8107-b17232347e42","ip":"172.28.3.1","netmask":"255.255.255.0","mac":"02:00:40:42:00:08","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2501","isolationUri":"vlan://2501","isSecurityGroupEnabled":false,"name":"ground1"},"instanceName":"r-18183-VM","wait":0}},{"SetupGuestNetworkCommand":{"dhcpRange":"172.28.3.1","networkDomain":"
z03.cloud.web-et-solutions.com
","defaultDns1":"195.7.111.30","defaultDns2":"91.151.119.67","isRedundant":false,"add":true,"nic":{"deviceId":4,"networkRateMbps":200,"defaultNic":false,"uuid":"f969ed36-9a5e-4683-8107-b17232347e42","ip":"172.28.3.1","netmask":"255.255.255.0","mac":"02:00:40:42:00:08","dns1":"195.7.111.30","dns2":"91.151.119.67","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2501","isolationUri":"vlan://2501","isSecurityGroupEnabled":false,"name":"ground1"},"accessDetails":{"router.guest.ip":"172.28.3.1","guest.vlan.tag":"2501","guest.network.gateway":"172.28.3.1","guest.bridge":"172.28.3.255","
router.name
":"r-18183-VM","router.ip":"169.254.3.22"},"wait":0}},{"routing.DhcpEntryCommand":{"vmMac":"02:00:19:85:00:06","vmIpAddress":"172.28.1.241","vmName":"tier-frontal2-test2","defaultRouter":"172.28.1.1","accessDetails":{"router.guest.ip":"172.28.1.1","zone.network.type":"Advanced","
router.name
":"r-18183-VM","router.ip":"169.254.3.22"},"wait":0}},{"routing.VmDataCommand":{"vmIpAddress":"172.28.1.241","vmName":"tier-frontal2-test2","accessDetails":{"router.guest.ip":"172.28.1.1","zone.network.type":"Advanced","router.ip":"169.254.3.22","
router.name":"r-18183-VM"},"wait":0}},{}] }


So I waited for my VPC to be recreated, but I still do not have my VLAN
interfaces.

I took a look at management-server.log and saw this:

2013-02-14 10:31:29,529 DEBUG [xen.resource.XenServerConnectionPool]
(DirectAgent-430:null) XmlRpcException for method: host.call_plugin due to
Failed to create input stream: Read timed out.  Reconnecting...retry=1
2013-02-14 10:31:29,529 DEBUG [xen.resource.CitrixResourceBase]
(DirectAgent-430:null) callHostPlugin failed for cmd: routerProxy with args
args: vpc_guestnw.sh 169.254.3.22 -C -d eth2 -i 172.28.2.1 -g 172.28.2.1 -m
24 -n 172.28.2.0 -s 195.7.111.30,91.151.119.67 -e
z03.cloud.web-et-solutions.com,  due to Failed to create input stream: Read
timed out
2013-02-14 10:31:29,529 WARN  [xen.resource.CitrixResourceBase]
(DirectAgent-430:null) Creating guest network failed due to
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for
cmd: routerProxy with args args: vpc_guestnw.sh 169.254.3.22 -C -d eth2 -i
172.28.2.1 -g 172.28.2.1 -m 24 -n 172.28.2.0 -s 195.7.111.30,91.151.119.67
-e z03.cloud.web-et-solutions.com,  due to Failed to create input stream:
Read timed out
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for
cmd: routerProxy with args args: vpc_guestnw.sh 169.254.3.22 -C -d eth2 -i
172.28.2.1 -g 172.28.2.1 -m 24 -n 172.28.2.0 -s 195.7.111.30,91.151.119.67
-e z03.cloud.web-et-solutions.com,  due to Failed to create input stream:
Read timed out
        at com.cloud.hypervisor.xen.resource.CitrixResourceBase.callHostPlugin(CitrixResourceBase.java:3749)
        at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:7375)
        at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:541)
        at com.cloud.hypervisor.xen.resource.XcpServerResource.executeRequest(XcpServerResource.java:52)
        at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:191)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2013-02-14 10:31:29,530 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-430:null) Seq 14-1794712962: Cancelling because one of the
answers is false and it is stop on error.
2013-02-14 10:31:29,530 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-430:null) Seq 14-1794712962: Response Received:
2013-02-14 10:31:29,531 DEBUG [agent.transport.Request]
(DirectAgent-430:null) Seq 14-1794712962: Processing:  { Ans: , MgmtId:
165013918770818, via: 14, Ver: v1, Flags: 110,
[{"StartAnswer":{"vm":{"id":18183,"name":"r-18183-VM","bootloader":"PyGrub","type":"DomainRouter","cpus":1,"speed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian
GNU/Linux 6(64-bit)","bootArgs":" vpccidr=172.28.0.0/16 domain=
z03.cloud.web-et-solutions.com dns1=195.7.111.30 dns2=91.151.119.67
template=domP name=r-18183-VM eth0ip=169.254.3.22 eth0mask=255.255.0.0
type=vpcrouter
disable_rp_filter=true","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"vncPassword":"9c5468337802250c","params":{},"uuid":"02f8704c-77dc-487c-b070-408c6e6c1060","disks":[{"id":8069,"name":"ROOT-18183","mountPoint":"/iqn.1984-05.com.dell:powervault.md3200i.6d4ae52000b3a400000000005019e097/6","path":"294080ab-1f18-48e7-9941-d8b449e2fecb","size":2147483648,"type":"ROOT","storagePoolType":"IscsiLUN","storagePoolUuid":"49ddaea6-0f2c-327a-844f-fd3f64f6963c","deviceId":0}],"nics":[{"deviceId":0,"networkRateMbps":-1,"defaultNic":false,"uuid":"c558916e-95f7-415d-9533-88a84b35d8ab","ip":"169.254.3.22","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:16","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"result":true,"wait":0}},{"check.CheckSshAnswer":{"result":true,"wait":0}},{"GetDomRVersionAnswer":{"templateVersion":"Cloudstack
Release 3.0 Mon Feb 6 15:10:04 PST
2012","scriptsVersion":"cd49bf69b25051c4dd7751c571c9e6f9","result":true,"details":"Cloudstack
Release 3.0 Mon Feb 6 15:10:04 PST
2012&cd49bf69b25051c4dd7751c571c9e6f9","wait":0}},{"NetworkUsageAnswer":{"routerName":"r-18183-VM","bytesSent":0,"bytesReceived":0,"result":true,"wait":0}},{"PlugNicAnswer":{"result":true,"details":"success","wait":0}},{"routing.IpAssocAnswer":{"results":["10.14.10.36
-
success"],"result":true,"wait":0}},{"routing.SetSourceNatAnswer":{"result":true,"details":"success","wait":0}},{"PlugNicAnswer":{"result":true,"details":"success","wait":0}},{"SetupGuestNetworkAnswer":{"result":false,"details":"Creating
guest network failed due to
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for
cmd: routerProxy with args args: vpc_guestnw.sh 169.254.3.22 -C -d eth2 -i
172.28.2.1 -g 172.28.2.1 -m 24 -n 172.28.2.0 -s 195.7.111.30,91.151.119.67
-e z03.cloud.web-et-solutions.com,  due to Failed to create input stream:
Read timed out","wait":0}}] }
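
As a rough aid for reading answers like the one above, here is a minimal
Python sketch that lists which answers in the array came back with
"result":false. It assumes the wrapped log lines have first been rejoined
so that the array after "Flags: 110," is valid JSON. SAMPLE_ANSWERS is an
abbreviated, hand-shortened stand-in for the real payload, and
failed_answers is a hypothetical helper name, not part of CloudStack.

```python
import json

# Abbreviated sample in the shape of the answer array logged above.
# Most fields are omitted and the details string is shortened by hand.
SAMPLE_ANSWERS = """
[{"StartAnswer":{"result":true,"wait":0}},
 {"check.CheckSshAnswer":{"result":true,"wait":0}},
 {"PlugNicAnswer":{"result":true,"details":"success","wait":0}},
 {"SetupGuestNetworkAnswer":{"result":false,
   "details":"Creating guest network failed due to ... Read timed out","wait":0}}]
"""


def failed_answers(payload):
    """Return (index, answer name, details) for every answer whose result is false."""
    failures = []
    for index, entry in enumerate(json.loads(payload)):
        # Each entry is a one-key object: {"<AnswerName>": {...fields...}}.
        for name, body in entry.items():
            if isinstance(body, dict) and body.get("result") is False:
                failures.append((index, name, body.get("details", "")))
    return failures


if __name__ == "__main__":
    for index, name, details in failed_answers(SAMPLE_ANSWERS):
        print(f"answer #{index}: {name} failed: {details}")
```

In the log above, the only failed entry is the SetupGuestNetworkAnswer for
eth2 (172.28.2.1); the sequence is then cancelled ("stop on error"), so no
answers are logged for the remaining tiers.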


How can I troubleshoot this?


Thanks a lot for your help.

Regards, Benoit Lair.

2013/2/13 Ahmad Emneina <aemneina@gmail.com>

> Hey Benoit,
>
> I queried some engineers on the Citrix QA team. They told me to try the
> restartVPC API call. This should recreate the router. If it doesn't, we
> should log a bug; there should be a way to recover from this easily.
>
>
> https://incubator.apache.org/cloudstack/docs/api/apidocs-4.0.0/root_admin/restartVPC.html
>
>
> On Wed, Feb 13, 2013 at 9:53 AM, benoit lair <kurushi4000@gmail.com>
> wrote:
>
> > Hello Ahmad,
> >
> > I shut down the VPC VR while the mgmt server was sending an ACL update,
> > so the mgmt server tells me the VPC VR has 3 rules; in reality,
> > iptables -L gives me only 2 ACLs.
> >
> > More simply, when I create a new VPC, create 3 tier networks and deploy
> > a VM into each tier network, all seems to be OK. Problems begin when I
> > simply reboot the VPC VR.
> >
> > After a reboot, the interfaces of my tiers (VLANs) are no longer
> > mounted.
> >
> > Even if I reboot each tier network, I can't recover access to my VLANs,
> > so no access to my VMs.
> >
> > In fact, I only have the control interface (169.254.x.x), the public
> > interface (in my case 10.14.10.36) and the localhost one.
> >
> > My master CIDR is 172.28.0.0/16, my tiers are 172.28.1/24,
> > 172.28.2.1/24 and 172.28.3.1/24. Before the reboot I had 3 interfaces:
> > 172.28.1.1, 172.28.2.1 and 172.28.3.1; after the reboot, no more
> > interfaces...
> >
> > I have already tried destroying the VPC VR and then creating a new VM
> > to force CS to detect that the VPC VR is missing and so relaunch the
> > installation of the VPC VR.
> >
> > Even in this case, the VPC is redeployed with only the mgmt, public and
> > loopback interfaces, no VLAN interfaces.
> >
> > Have you already encountered this problem?
> >
> > 2013/2/11 Anthony Xu <Xuefei.Xu@citrix.com>
> >
> > > You can try restarting the network with "clean" checked in the UI; it
> > > will try to destroy the virtual router on this network and create a
> > > new one.
> > >
> > >
> > > Anthony
> > >
> > > > -----Original Message-----
> > > > From: Ahmad Emneina [mailto:aemneina@gmail.com]
> > > > Sent: Monday, February 11, 2013 10:48 AM
> > > > To: Cloudstack users
> > > > > Subject: Re: Production recovery procedure for VPC Virtual router
> > > > > into advanced zone - CS 4.0.0
> > > >
> > > > > How did you crash the VPC router?
> > > > >
> > > > > Did spinning up a VM, or restarting a VM, in the affected network
> > > > > restore the router? (This is probably the easiest way to correct the
> > > > > issue, if CloudStack understands the router is gone.)
> > > > >
> > > > > I would only spin up a VM after a destroy router is called, ensuring
> > > > > CloudStack understands that the router needs to be recreated.
> > > >
> > > >
> > > > On Mon, Feb 11, 2013 at 8:17 AM, benoit lair <kurushi4000@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > > I am currently working on a preproduction model of CS 4.0.0 on a
> > > > > > CentOS 6.3 management server.
> > > > > >
> > > > > > I have deployed my zones, networks and primary iSCSI storage on my
> > > > > > array, and configured my clusters with XCP 1.1.
> > > > > >
> > > > > > Now I'm looking at deploying the VPC features.
> > > > > >
> > > > > > I have configured my VPC, defined my CIDR, created 3 tier networks
> > > > > > and deployed 1 VM onto each tier network.
> > > > > >
> > > > > > All worked very well.
> > > > > >
> > > > > > Now I've tried to force a crash of the VPC virtual router in order
> > > > > > to test production recovery of the VPC.
> > > > > >
> > > > > > I got some problems in certain cases: after crashing my VPC router,
> > > > > > my tier networks were not mounted on the virtual router.
> > > > > >
> > > > > > This made it impossible to recover production for the VMs on my
> > > > > > tier networks.
> > > > > >
> > > > > > So I'm looking for different recovery procedures in order to
> > > > > > recover a fully functional VPC cloud.
> > > > > >
> > > > > > If somebody has best practices for recovering a VPC router, I
> > > > > > would appreciate any help.
> > > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Benoit Lair.
> > > > >
> > >
> >
>
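
The restartVPC call suggested in the quoted thread can also be issued
against the API directly rather than through the UI. Below is a minimal
Python sketch of building a signed request URL. It assumes the usual
CloudStack request-signing scheme (URL-encode the values, sort the
parameters by name, lowercase the whole string, HMAC-SHA1 it with the
secret key, then Base64- and URL-encode the result) and that restartVPC
takes the VPC id in an "id" parameter; please check both against the API
documentation linked in the thread. The endpoint, keys and VPC id are
placeholders, and signed_query is a hypothetical helper name.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params, api_key, secret_key):
    """Return a signed query string for the given API parameters."""
    full = dict(params, apikey=api_key)
    # URL-encode each value, then sort the pairs by lowercased parameter name.
    pairs = sorted(
        ((key, quote(str(value), safe="")) for key, value in full.items()),
        key=lambda pair: pair[0].lower(),
    )
    query = "&".join(f"{key}={value}" for key, value in pairs)
    # The signature is computed over the lowercased query string.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return f"{query}&signature={signature}"


if __name__ == "__main__":
    # Placeholder values: substitute your own endpoint, keys and VPC id.
    endpoint = "http://mgmt.example.com:8080/client/api"
    qs = signed_query(
        {"command": "restartVPC", "id": "VPC-UUID-HERE", "response": "json"},
        api_key="API-KEY",
        secret_key="SECRET-KEY",
    )
    print(f"{endpoint}?{qs}")
```

The script only prints the URL; it does not send the request. Values with
unusual characters may be encoded slightly differently by the server side,
so this sketch is best kept to simple values such as ids and command names.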
