cloudstack-dev mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: ACS 4.5.1 KVM live migration problem
Date Fri, 15 May 2015 13:01:30 GMT
OK, but since they are guest NICs, it confuses me. This is an advanced zone
with VLANs, right? Then, as I understand it, all NICs of a user VM need to
have some isolation method...

Anyway, I'm running an advanced zone with VLANs, and all VMs (VMs behind a
VPC and VMs on the internet/public network, which is still a Guest network)
have some vlan://xxxxx value.

For the VR, SSVM and CPVM, there are NICs on the "ACS public" network that
don't use a VLAN; they have "vlan://untagged". "NULL" is only used for
LinkLocal (169.x) NICs, and for the mgmt/sec-storage NICs of the SSVM/CPVM
in my case.
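
A minimal sketch of the check discussed in this thread, assuming the rows
come from the "select instance_id,isolation_uri,broadcast_uri from nics ..."
query quoted below. The names classify_uri and suspect_nics and the category
labels are hypothetical, not part of CloudStack; the sample rows are the
three reported for instances 564, 664 and 1111. It only flags values that
are NULL or a bare number rather than a scheme://value URI; it does not
decide what the correct value should be.

```python
from urllib.parse import urlparse


def classify_uri(value):
    """Classify one isolation_uri/broadcast_uri value (labels are illustrative)."""
    if value is None:
        return "null"  # SQL NULL, as reported for instance 664
    text = value.strip()
    if text.isdigit():
        return "bare-number"  # e.g. "96": a VLAN ID without a scheme
    parsed = urlparse(text)
    if parsed.scheme and parsed.netloc:
        return "uri"  # e.g. "vlan://96" or "vlan://untagged"
    return "unknown"


def suspect_nics(rows):
    """Return instance_ids whose isolation or broadcast value is not a URI."""
    suspects = []
    for instance_id, isolation_uri, broadcast_uri in rows:
        kinds = {classify_uri(isolation_uri), classify_uri(broadcast_uri)}
        if kinds != {"uri"}:
            suspects.append(instance_id)
    return suspects


# Sample rows copied from the cloud.nics output in this thread.
rows = [
    (564, "vlan://96", "vlan://96"),
    (664, None, None),
    (1111, "vlan://1127", "vlan://1127"),
]

print(suspect_nics(rows))  # prints [664]
```

NULL can be legitimate for some NICs (LinkLocal ones in my case), so a
flagged row is something to inspect, not necessarily an error.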



On 15 May 2015 at 13:47, Andrei Mikhailovsky <andrei@arhont.com> wrote:

> Andrija,
>
> I've run the command and it showed me a bunch of running VMs with NULLs. I
> would say roughly 20% of my total running VMs have NULL under the
> isolation and broadcast URIs.
>
> All of these vms are working perfectly well (in terms of network
> connectivity) and there is nothing special about them. They all have at
> least one guest NIC.
>
> Andrei
> ----- Original Message -----
>
> From: "Andrija Panic" <andrija.panic@gmail.com>
> To: dev@cloudstack.apache.org
> Cc: users@cloudstack.apache.org
> Sent: Friday, 15 May, 2015 12:34:24 PM
> Subject: Re: ACS 4.5.1 KVM live migration problem
>
> Andrei,
>
> select instance_id,isolation_uri,broadcast_uri from nics where instance_id
> in (select id from vm_instance where state='Running' and name not like
> 'r-%' and name not like 'v-%' and name not like 's-%') order by
> instance_id;
>
> This gives me every NIC that does not belong to a router, SSVM or CPVM. I
> always have VLAN values: since these are all Guest NICs, they must have a
> VLAN ID...
> NULL values are only present when a VM is deleted/stopped, in my case...
>
> Can you check your VM 664? What is so specific about it?
> All NICs (in my understanding, if this is an advanced zone) must have some
> VLAN, and cannot be NULL or untagged?
>
> On 15 May 2015 at 12:58, Andrei Mikhailovsky <andrei@arhont.com> wrote:
>
> >
> >
> > Hi Andrija, Marcus,
> >
> > Thanks for your comments and suggestions. I've checked the cloud.nics
> > table:
> >
> > mysql> select instance_id,isolation_uri,broadcast_uri from nics where
> > instance_id=564 or instance_id=664 or instance_id=1111;
> > +-------------+---------------+---------------+
> > | instance_id | isolation_uri | broadcast_uri |
> > +-------------+---------------+---------------+
> > |         564 | vlan://96     | vlan://96     |
> > |         664 | NULL          | NULL          |
> > |        1111 | vlan://1127   | vlan://1127   |
> > +-------------+---------------+---------------+
> >
> >
> > From my tests, instance_ids 564 and 1111 migrate correctly, but
> > instance 664 does not and shows an NPE similar to the one I've given.
> >
> >
> > Is this what is causing the migration issues? If so, should I change all
> > isolation_uri and broadcast_uri to the corresponding network vlan ids?
> >
> > Thanks
> >
> > Andrei
> >
> > ----- Original Message -----
> >
> > From: "Andrija Panic" <andrija.panic@gmail.com>
> > To: dev@cloudstack.apache.org
> > Sent: Thursday, 14 May, 2015 4:00:07 PM
> > Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem
> >
> > That would probably be a bug that I had...but we updated main VLAN table
> > with change URI or something... Marcus saved me that time :)
> > Andrei, please provide more info and the info Marcus said, I will try to
> > compare my values with yours if of any help.
> >
> > On 14 May 2015 at 16:56, Marcus <shadowsor@gmail.com> wrote:
> >
> > > So, I vaguely remember an issue introduced a little over a year ago
> > > where the broadcast domain value of the nic was changed from a URI to
> > > just a vlan ID, which worked for vlans but broke vxlan and some other
> > > things. If I remember correctly, there would be a small set of installs
> > > during this period that wouldn't have created their nics with the
> > > correct broadcast domain value. I don't remember which versions were
> > > doing this but I do know there's a JIRA ticket and a paper trail on how
> > > people were fixing it. The code that broke the URI was backed out. VMs
> > > created with the bad code would not be compatible with the new or the
> > > old versions of code.
> > >
> > > I was under the impression at the time that there was some SQL provided
> > > to update the values during an upgrade; perhaps that never made it in,
> > > or somehow got skipped during your upgrade process. At any rate, since
> > > there is a null pointer on broadcast domain type, you may check your
> > > nics/networks in the MySQL db and verify that the broadcast/isolation
> > > types are in URI format and not just a number. Or try to find the bug
> > > I'm referring to from around April last year.
> > > On May 14, 2015 5:04 AM, "Andrei Mikhailovsky" <andrei@arhont.com>
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > Forwarding the message to the dev list as I've not had much reply on
> > > > the users list.
> > > >
> > > > In summary, after upgrading from ACS 4.4.2 to 4.5.1 I started having
> > > > migration issues with a lot of VMs. Some VMs migrate successfully
> > > > and others do not.
> > > >
> > > > The logs are shown below.
> > > >
> > > > Could someone help me get to the bottom of this problem?
> > > >
> > > > Thanks
> > > >
> > > > Andrei
> > > >
> > > >
> > > >
> > > > ----- Forwarded Message -----
> > > > From: "Andrei Mikhailovsky" <andrei@arhont.com>
> > > > To: users@cloudstack.apache.org
> > > > Sent: Wednesday, 13 May, 2015 10:44:29 AM
> > > > Subject: Re: ACS 4.5.1 KVM live migration problem
> > > >
> > > > Hi Rohit,
> > > >
> > > > forgot to answer you on the cloud.vlan table.
> > > >
> > > > That particular VM has a network with vlan id 1151, as shown when I
> > > > look at the network details in the ACS GUI. However, this vlan is not
> > > > shown in the cloud.vlan table. From what I can see, the cloud.vlan
> > > > table shows only the public and management network vlan interfaces
> > > > and does not show the guest network vlans.
> > > >
> > > > The public network vlan used for routing traffic to the internet
> > > > from this particular VM is:
> > > >
> > > >
> > > > mysql> select * from vlan where id=12;
> > > > +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> > > > | id | uuid                                 | vlan_id     | vlan_gateway  | vlan_netmask    | description                   | vlan_type      | data_center_id | network_id | physical_network_id | ip6_gateway | ip6_cidr | ip6_range | removed | created |
> > > > +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> > > > | 12 | d13ea4b3-2087-4376-9d0a-f54efe2a55af | vlan://2030 | 178.XXX.XXX.1 | 255.255.255.128 | 178.XXX.XXX.2-178.XXX.XXX.119 | VirtualNetwork |              1 |        200 |                 200 | NULL        | NULL     | NULL      | NULL    | NULL    |
> > > > +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> > > > 1 row in set (0.00 sec)
> > > >
> > > >
> > > > Hope that helps
> > > >
> > > > Andrei
> > > > ----- Original Message -----
> > > >
> > > > From: "Rohit Yadav" <rohit.yadav@shapeblue.com>
> > > > To: users@cloudstack.apache.org
> > > > Sent: Wednesday, 13 May, 2015 8:55:55 AM
> > > > Subject: Re: ACS 4.5.1 KVM live migration problem
> > > >
> > > > Hi Andrei,
> > > >
> > > > This looks like an issue similar to
> > > > https://issues.apache.org/jira/browse/CLOUDSTACK-6893
> > > > Can you share the row from your cloud.vlan table and the value of
> > > > "select cache_mode from volume_view where vm_id=<put the vm id here>\G;"
> > > > for the VM causing the NPE?
> > > >
> > > > > On 12-May-2015, at 10:51 pm, Andrei Mikhailovsky
> > > > > <andrei@arhont.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > It seems that the problem is worse than I initially thought. In
> > > > > > fact, I can't migrate most of my VMs apart from a handful, and I
> > > > > > can't determine a correlation between the migratable VMs and the
> > > > > > ones that produce the exception.
> > > > >
> > > > > Thanks for any help.
> > > > >
> > > > > Andrei
> > > > >
> > > > > ----- Original Message -----
> > > > >
> > > > > From: "Andrei Mikhailovsky" <andrei@arhont.com>
> > > > > To: users@cloudstack.apache.org
> > > > > Sent: Tuesday, 12 May, 2015 8:53:16 PM
> > > > > Subject: ACS 4.5.1 KVM live migration problem
> > > > >
> > > > > Hi,
> > > > >
> > > > > > I am having an issue migrating some of my VMs after recently
> > > > > > upgrading to ACS 4.5.1. I am running Ubuntu 14.04 on both host
> > > > > > and management servers. Here is the output from the log file on
> > > > > > a client agent:
> > > > >
> > > > >
> > > > > 2015-05-12 20:42:34,154 DEBUG
> [kvm.resource.LibvirtComputingResource]
> > > > (agentRequest-Handler-1:null) Preparing host for migrating
> > > > com.cloud.agent.api.to.VirtualMachineTO@21a038ac
> > > > > 2015-05-12 20:42:34,157 DEBUG [kvm.resource.LibvirtConnection]
> > > > (agentRequest-Handler-1:null) can't find connection: KVM, for vm:
> > > > i-9-1162-VM, continue
> > > > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection]
> > > > (agentRequest-Handler-1:null) can't find connection: LXC, for vm:
> > > > i-9-1162-VM, continue
> > > > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection]
> > > > (agentRequest-Handler-1:null) can't find which hypervisor the vm
> used ,
> > > > then use the default hypervisor
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver]
> > > > (agentRequest-Handler-1:null)
> > nic=[Nic:Guest-178.248.108.205-vlan://2014]
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver]
> > > > (agentRequest-Handler-1:null) creating a vNet dev and bridge for
> guest
> > > > traffic per traffic label cloudstackbr0
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver]
> > > > (agentRequest-Handler-1:null) Executing:
> > > > /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh -v
> > > 2014
> > > > -p bond0 -b brbond0-2014 -o add
> > > > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver]
> > > > (agentRequest-Handler-1:null) Execution is successful.
> > > > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver]
> > > > (agentRequest-Handler-1:null) nic=[Nic:Guest-10.1.1.66-null]
> > > > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.KVMStoragePoolManager]
> > > > (agentRequest-Handler-1:null) Disconnecting disk
> > > > 23add201-e4ee-447b-a448-ecd152aea4ad
> > > > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> > > > (agentRequest-Handler-1:null) Trying to fetch storage pool
> > > > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.KVMStoragePoolManager]
> > > > (agentRequest-Handler-1:null) Disconnecting disk
> > > > 55100d25-410e-4fa3-a38b-7717f74d2afe
> > > > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> > > > (agentRequest-Handler-1:null) Trying to fetch storage pool
> > > > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,232 DEBUG [kvm.storage.KVMStoragePoolManager]
> > > > (agentRequest-Handler-1:null) Disconnecting disk
> > > > 2db59d16-d17f-49a1-b913-7fbe4025a549
> > > > > 2015-05-12 20:42:34,233 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> > > > (agentRequest-Handler-1:null) Trying to fetch storage pool
> > > > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.KVMStoragePoolManager]
> > > > (agentRequest-Handler-1:null) Disconnecting disk
> > > > 17afbf31-ac89-46f7-a2c8-f8aed796e4c6
> > > > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> > > > (agentRequest-Handler-1:null) Trying to fetch storage pool
> > > > d8d5ec36-3cb0-39af-8fc6-084a4abd5d28 from libvirt
> > > > > 2015-05-12 20:42:34,254 WARN [cloud.agent.Agent]
> > > > (agentRequest-Handler-1:null) Caught:
> > > > > java.lang.NullPointerException
> > > > > at
> > > >
> > >
> >
> com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172)
> > > > > at
> > > >
> > >
> >
> com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226)
> > > > > at
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105)
> > > > > at
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230)
> > > > > at
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307)
> > > > > at com.cloud.agent.Agent.processRequest(Agent.java:503)
> > > > > at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808)
> > > > > at com.cloud.utils.nio.Task.run(Task.java:84)
> > > > > at
> > > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > > at
> > > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > > at java.lang.Thread.run(Thread.java:745)
> > > > > 2015-05-12 20:42:34,256 DEBUG [cloud.agent.Agent]
> > > > (agentRequest-Handler-1:null) Seq 7-7525233502359390941: { Ans: ,
> > MgmtId:
> > > > 115129173025118, via: 7, Ver: v1, Flags: 110,
> > > >
> > >
> >
> [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat
> > > >
> > >
> >
> com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172)\n\tat
> > > >
> > >
> >
> com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226)\n\tat
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105)\n\tat
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230)\n\tat
> > > >
> > >
> >
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307)\n\tat
> > > > com.cloud.agent.Agent.processRequest(Agent.java:503)\n\tat
> > > >
> com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808)\n\tat
> > > > com.cloud.utils.nio.Task.run(Task.java:84)\n\tat
> > > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat
> > > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
> > > > java.lang.Thread.run(Thread.java:745)\n","wait":0}}] }
> > > > >
> > > > >
> > > > >
> > > > > > Any idea how to get this fixed? Not sure why all of a sudden the
> > > > > > migration stopped working for a handful of VMs. I can
> > > > > > successfully migrate some VMs, but not others.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Andrei
> > > > >
> > > > >
> > > >
> > > > Regards,
> > > > Rohit Yadav
> > > > Software Architect, ShapeBlue
> > > > M. +91 88 262 30892 | rohit.yadav@shapeblue.com
> > > > Blog: bhaisaab.org | Twitter: @_bhaisaab
> > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Andrija Panić
> >
> >
>
>
> --
>
> Andrija Panić
>
>


-- 

Andrija Panić
