cloudstack-dev mailing list archives

From Rafael Weingärtner <rafaelweingart...@gmail.com>
Subject Re: Unable to start agent after roll-back to previous ACS version
Date Thu, 20 Oct 2016 14:11:03 GMT
I think the source code of 4.9 will be more or less the same as that of 4.2.

I missed another bit of code at lines 1087-1094: “if (_pifs.get("private")
== null)”.

It looks for a file at “/sys/class/net/” + _guestBridgeName. If the file
exists, it executes “_pifs.put("private", _guestBridgeName)”. Does that file
exist for you?
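
To make that check concrete, here is a rough Python sketch of what the fallback appears to do. The real code is Java inside LibvirtComputingResource; the function name `fallback_private_pif` and the `sys_class_net` parameter are only made up for illustration, so treat this as an approximation and not the actual implementation.

```python
import os


def fallback_private_pif(pifs, guest_bridge_name, sys_class_net="/sys/class/net"):
    """Approximate the fallback: if no "private" PIF was found, use the guest
    bridge name itself, provided it shows up under /sys/class/net.

    `pifs` is a plain dict standing in for the agent's _pifs map (assumption).
    """
    if pifs.get("private") is None and guest_bridge_name:
        candidate = os.path.join(sys_class_net, guest_bridge_name)
        if os.path.exists(candidate):
            pifs["private"] = guest_bridge_name
    return pifs


if __name__ == "__main__":
    # "cloudbr1" is the default bridge name mentioned earlier in this thread.
    print(fallback_private_pif({}, "cloudbr1"))
```

If this prints an empty dict on your host, the fallback would not help either, and the agent would still fail with "Failed to get private nic name".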

There is another piece of code at lines 1059-1071 that may also be used to
get the private PIF name. It lists the files in
“/sys/devices/virtual/net”.

Can you list the directories “/sys/class/net/” and “/sys/devices/virtual/net”?
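
If it is easier, a small script along these lines would collect the same information in one go. It is only a sketch based on the debug messages in your log ("looking in file .../bridge", "did not find an eth*, bond*, vlan*, em*, or p*p* in .../brif"); the helper name `list_bridges_and_pifs` is hypothetical, and the glob patterns are my approximation of whatever matching the agent really does.

```python
import fnmatch
import os

# Name patterns copied from the agent's debug message; the exact matching
# logic in the Java code may differ (assumption).
PIF_PATTERNS = ["eth*", "bond*", "vlan*", "em*", "p*p*"]


def list_bridges_and_pifs(virtual_net="/sys/devices/virtual/net"):
    """Return {bridge_name: [candidate physical interfaces]}.

    A device counts as a bridge if it has a "bridge" subdirectory; its
    members are read from the "brif" subdirectory.
    """
    result = {}
    for name in sorted(os.listdir(virtual_net)):
        if not os.path.isdir(os.path.join(virtual_net, name, "bridge")):
            continue
        brif = os.path.join(virtual_net, name, "brif")
        members = sorted(os.listdir(brif)) if os.path.isdir(brif) else []
        result[name] = [
            m for m in members
            if any(fnmatch.fnmatch(m, p) for p in PIF_PATTERNS)
        ]
    return result


if __name__ == "__main__":
    for d in ("/sys/class/net", "/sys/devices/virtual/net"):
        print(d, sorted(os.listdir(d)) if os.path.isdir(d) else "(missing)")
    if os.path.isdir("/sys/devices/virtual/net"):
        print(list_bridges_and_pifs())
```

A bridge that maps to an empty list here corresponds to the "failing to get physical interface from bridge" messages in the log.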

I will now check the source code for ACS 4.2, and see if it is the same.

On Thu, Oct 20, 2016 at 12:01 PM, Cloud List <cloud-list@sg.or.id> wrote:

> Hi Rafael,
>
> Thanks for your reply.
>
> Here's the output of the command:
>
> root@test-kvm-03:/var/log/cloudstack/agent# ovs-vsctl list-br | sed '{:q;N;s/\\n/%/g;t q}'
> The program 'ovs-vsctl' is currently not installed.  You can install it by typing:
> apt-get install openvswitch-switch
>
> I believe the command is only applicable if we are using Open vSwitch. We
> are using the normal Ubuntu network bridges rather than Open vSwitch.
> Furthermore, we are trying to roll back to 4.2, and this issue happens when
> I try to start the agent after downgrading it to 4.2. Shouldn't we
> be checking the 4.2 source code instead?
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
> On Thu, Oct 20, 2016 at 8:38 PM, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
>
> > Hi, Anonymous fellow ;)
> > Let’s see if I can help you a little bit. I am checking the ACS 4.9
> > source code.
> >
> > The error is thrown at line 896 of class
> > “com.cloud.hypervisor.kvm.resource.LibvirtComputingResource”.
> > The condition that causes the error is “_pifs.get("private") == null”.
> > “_pifs” is a map. The key “private” is added to the map at line 1124 if
> > the condition “_guestBridgeName != null && bridge.equals(_guestBridgeName)”
> > is met.
> > The variable “_guestBridgeName” is a String that can receive the value of
> > “guest.network.device” parameter or “_privBridgeName” variable. This
> > process happens at lines 752-755. The process of assigning a value to
> > “_privBridgeName” happens at lines 747-750. The default value for
> > “_privBridgeName” is “cloudbr1”. The default can be overridden by
> > “private.network.device” parameter.
> >
> > Having detailed the parameters, let's see how ACS gets the “bridge” value.
> > It gets that value from the code at line 1113, “cmdout.split("%")”. The
> > variable “cmdout” contains the output of the following OS command:
> > “ovs-vsctl list-br | sed '{:q;N;s/\\n/%/g;t q}'”.
> >
> > Can you run the command and check its output?
> >
> >
> > On Thu, Oct 20, 2016 at 8:49 AM, Cloud List <cloud-list@sg.or.id> wrote:
> >
> > > Hi,
> > >
> > > We are using ACS versions 4.2 and 4.9 in our test environment, with
> > > Ubuntu 12.04 as the operating system and KVM as the hypervisor.
> > >
> > > We are trying to simulate an upgrade from ACS 4.2 to 4.9 and a roll-back
> > > from 4.9 to 4.2 in our test environment. The upgrade went smoothly, and
> > > the roll-back also went well, except when we need to start the agent
> > > after downgrading it.
> > >
> > > After uninstalling cloudstack-agent version 4.9 and reinstalling
> > > cloudstack-agent version 4.2, I am not able to start the agent. It fails
> > > with the error messages below:
> > >
> > > ====
> > > 2016-10-20 17:32:28,187 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloud0/bridge
> > > 2016-10-20 17:32:28,187 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloud0
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/lo/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloudbr1/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloudbr1
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet4/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet5/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet6/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet7/bridge
> > > 2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet8/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet9/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet0/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet1/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet2/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet3/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/virbr0/bridge
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge virbr0
> > > 2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet11/bridge
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet12/bridge
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet13/bridge
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet14/bridge
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet10/bridge
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloud0
> > > 2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet1'
> > > 2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet2'
> > > 2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet5'
> > > 2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) failing to get physical interface from bridge cloud0, did not find an eth*, bond*, vlan*, em*, or p*p* in /sys/devices/virtual/net/cloud0/brif
> > > 2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloudbr1
> > > 2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'eth1'
> > > 2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge virbr0
> > > 2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) failing to get physical interface from bridge virbr0, did not find an eth*, bond*, vlan*, em*, or p*p* in /sys/devices/virtual/net/virbr0/brif
> > > 2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) done looking for pifs, no more bridges
> > > 2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Failed to get private nic name
> > > 2016-10-20 17:32:28,192 ERROR [cloud.agent.AgentShell] (main:null) (logid:) Unable to start agent: Failed to get private nic name
> > > ====
> > >
> > > Below is the result of brctl show and the content of
> > > /etc/network/interfaces:
> > >
> > > ====
> > > root@test-kvm-03:/var/log/cloudstack/agent# brctl show
> > > bridge name     bridge id               STP enabled     interfaces
> > > cloud0          8000.fe00a9fe00f8       no              vnet1
> > >                                                         vnet2
> > >                                                         vnet5
> > > cloudbr1        8000.d067e5ec82c0       no              eth1
> > >                                                         vnet0
> > >                                                         vnet10
> > >                                                         vnet11
> > >                                                         vnet12
> > >                                                         vnet13
> > >                                                         vnet14
> > >                                                         vnet3
> > >                                                         vnet4
> > >                                                         vnet6
> > >                                                         vnet7
> > >                                                         vnet8
> > >                                                         vnet9
> > > virbr0          8000.000000000000       yes
> > > ====
> > >
> > > /etc/network/interfaces:
> > >
> > > ====
> > > # The loopback network interface
> > > auto lo
> > > iface lo inet loopback
> > >
> > > auto eth1
> > > #iface eth1 inet static
> > > iface eth1 inet manual
> > >
> > > auto cloudbr1
> > > iface cloudbr1 inet static
> > > bridge_ports eth1
> > >         address 192.168.0.201
> > >         netmask 255.255.255.0
> > >         network 192.168.0.0
> > >         broadcast 192.168.0.255
> > >         gateway 192.168.0.1
> > >         dns-nameservers 8.8.8.8 8.8.4.4
> > >         dns-search xxxxx.com
> > >
> > > auto cloudbr1:0
> > > iface cloudbr1:0 inet static
> > >         address 192.168.3.201
> > >         netmask 255.255.255.0
> > >         network 192.168.3.0
> > >         broadcast 192.168.3.255
> > > ====
> > >
> > > It seems that the error messages are complaining that no physical
> > > interface has been added to the cloud0 and virbr0 bridges. I tried adding
> > > eth1 and cloudbr1 to those bridges, but it didn't work. Deleting the cloud0
> > > and virbr0 bridges doesn't help either; the agent still complains that it
> > > cannot find the "pifs":
> > >
> > > ====
> > > 2016-10-20 18:00:48,331 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/lo/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloudbr1/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloudbr1
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet4/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet5/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet6/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet7/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet8/bridge
> > > 2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet9/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet0/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet1/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet2/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet3/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet11/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet12/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet13/bridge
> > > 2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet14/bridge
> > > 2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet10/bridge
> > > 2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloudbr1
> > > 2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'eth1'
> > > 2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) done looking for pifs, no more bridges
> > > 2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Failed to get private nic name
> > > 2016-10-20 18:00:48,334 ERROR [cloud.agent.AgentShell] (main:null) (logid:) Unable to start agent: Failed to get private nic name
> > > ====
> > >
> > > I understand that the required bridge information is supposed to be
> > > added by CloudStack when the host is added. Is there a way I can add the
> > > bridge information again manually without having to delete and re-add
> > > the host in CloudStack? We want to keep the VMs running during the
> > > downgrade, and deleting and re-adding the host in CloudStack would shut
> > > down the VMs.
> > >
> > > Any advice is greatly appreciated.
> > >
> > > Thank you.
> > >
> > > -ip-
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>



-- 
Rafael Weingärtner
