cloudstack-users mailing list archives

From Javier Rodriguez <jrodrig...@avalonbiometrics.com>
Subject Can't add host to cloud: "Nics are not configured!" / "Failed to get public nic name"
Date Mon, 20 May 2013 09:25:00 GMT
Hi all,

I'm testing CloudStack on two of our servers and I'm having trouble 
starting the agent on the hypervisor node.

Here's a summary of my configuration:

0 - Network infrastructure:

   subnet: 172.16.0.0/16
   gateway 172.16.0.1
   DNS1: 192.168.1.204


1 - Cloud management server:
     HW: Dell PowerEdge R320
     OS: CentOS 6.4
     IP: 172.16.2.2/16
     hostname: morpheus  (morpheus.biometrics.local)

2 - Hypervisor node:
     HW: Dell PowerEdge R320
     OS: CentOS 6.4
     IP: 172.16.2.3/16
     Hypervisor type: KVM
     hostname: mnode-1 (mnode-1.biometrics.local)


Everything worked seamlessly on the cloud manager's side, and I didn't 
see any error while following the steps for the kvm hypervisor node 
described in the install manual. In the manager console, I was able to 
add the zone, pod and cluster, but when I tried to create the host, it 
spent several minutes "creating" and then I got a popup alert with the 
error: "Unable to add the host".

management-server.log in morpheus :
=======================

  2013-05-20 11:52:05,403 DEBUG [cloud.api.ApiServlet] 
(catalina-exec-25:null) ===START===  192.168.1.187 -- GET 
command=addHost&zoneid=ac6c892c-8a9d-494b-8d3c-c612263059a0&podid=a4dad17d-f8dc-4734-a6b2-46954144b954&clusterid=b808e253-3aa2-4372-8520-84d8cda88c27&hypervisor=KVM&clustertype=CloudManaged&hosttags=&username=root&url=http%3A%2F%2Fmnode-1&response=json&sessionkey=%2BcU40ygVQMKpfKAI9il7SspVnBk%3D&_=1369039922245
2013-05-20 11:52:05,413 INFO  [cloud.resource.ResourceManagerImpl] 
(catalina-exec-25:null) Trying to add a new host at http://mnode-1 in 
data center 2
2013-05-20 11:52:05,665 DEBUG [utils.ssh.SSHCmdHelper] 
(catalina-exec-25:null) Executing cmd: lsmod|grep kvm
2013-05-20 11:52:06,797 DEBUG [utils.ssh.SSHCmdHelper] 
(catalina-exec-25:null) lsmod|grep kvm output:kvm_intel 53484  0
kvm                   316602  1 kvm_intel

2013-05-20 11:52:07,812 DEBUG [utils.ssh.SSHCmdHelper] 
(catalina-exec-25:null) Executing cmd: cloud-setup-agent  -m 172.16.2.2 
-z 2 -p 2 -c 2 -g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a 
--pubNic=cloudbr0 --prvNic=cloudbr0 --guestNic=cloudbr0
2013-05-20 11:52:31,388 DEBUG 
[cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Skip 
capacity scan due to there is no Primary Storage UPintenance mode
2013-05-20 11:52:31,805 DEBUG [storage.snapshot.SnapshotSchedulerImpl] 
(SnapshotPollTask:null) Snapshot scheduler.poll is being called at 
2013-05-20 09:52:31 GMT
2013-05-20 11:52:31,806 DEBUG [storage.snapshot.SnapshotSchedulerImpl] 
(SnapshotPollTask:null) Got 0 snapshots to be executed at 2013-05-20 
09:52:31 GMT
2013-05-20 11:52:31,830 DEBUG 
[cloud.network.ExternalLoadBalancerUsageManagerImpl] 
(ExternalNetworkMonitor-1:null) External load balancer devices stats 
collector is running...
2013-05-20 11:52:31,868 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterMonitor-1:null) Found 0 running routers.
2013-05-20 11:52:31,870 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
2013-05-20 11:52:41,440 DEBUG [utils.ssh.SSHCmdHelper] 
(catalina-exec-25:null) cloud-setup-agent  -m 172.16.2.2 -z 2 -p 2 -c 2 
-g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a --pubNic=cloudbr0 
--prvNic=cloudbr0 --guestNic=cloudbr0 output:CloudStack Agent setup is done!
    Configure Cgroup ...
2013-05-20 11:52:46,100 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-1:null) HostStatsCollector is running...
2013-05-20 11:52:46,100 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-2:null) VmStatsCollector is running...
2013-05-20 11:52:46,114 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) StorageCollector is running...
2013-05-20 11:53:01,388 DEBUG 
[cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Skip 
capacity scan due to there is no Primary Storage UPintenance mode
2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Running Capacity Checker ...
2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) recalculating system capacity
2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Executing cpu/ram capacity update
2013-05-20 11:53:01,794 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done executing cpu/ram capacity update
2013-05-20 11:53:01,794 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Executing storage capacity update
2013-05-20 11:53:01,795 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done executing storage capacity update
2013-05-20 11:53:01,795 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Executing capacity updates for public ip and Vlans
2013-05-20 11:53:01,803 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done capacity updates for public ip and Vlans
2013-05-20 11:53:01,803 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Executing capacity updates for private ip
2013-05-20 11:53:01,807 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done executing capacity updates for private ip
2013-05-20 11:53:01,807 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done recalculating system capacity
2013-05-20 11:53:01,817 DEBUG [cloud.alert.AlertManagerImpl] 
(CapacityChecker:null) Done running Capacity Checker ...
2013-05-20 11:53:01,870 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
2013-05-20 11:53:31,388 DEBUG 
[cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Skip 
capacity scan due to there is no Primary Storage UPintenance mode
2013-05-20 11:53:31,870 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
2013-05-20 11:53:46,102 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-1:null) HostStatsCollector is running...
2013-05-20 11:53:46,102 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-2:null) VmStatsCollector is running...
2013-05-20 11:53:46,117 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) StorageCollector is running...
( ... )
2013-05-20 11:57:31,806 DEBUG [storage.snapshot.SnapshotSchedulerImpl] 
(SnapshotPollTask:null) Snapshot scheduler.poll is being called at 
2013-05-20 09:57:31 GMT
2013-05-20 11:57:31,807 DEBUG [storage.snapshot.SnapshotSchedulerImpl] 
(SnapshotPollTask:null) Got 0 snapshots to be executed at 2013-05-20 
09:57:31 GMT
2013-05-20 11:57:31,830 DEBUG 
[cloud.network.ExternalLoadBalancerUsageManagerImpl] 
(ExternalNetworkMonitor-1:null) External load balancer devices stats 
collector is running...
2013-05-20 11:57:31,868 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterMonitor-1:null) Found 0 running routers.
2013-05-20 11:57:31,870 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
2013-05-20 11:57:42,458 DEBUG [kvm.discoverer.KvmServerDiscoverer] 
(catalina-exec-25:null) Timeout, to wait for the host connecting to mgt 
svr, assuming it is failed
2013-05-20 11:57:42,461 WARN  [cloud.resource.ResourceManagerImpl] 
(catalina-exec-25:null) Unable to find the server resources at 
http://mnode-1
2013-05-20 11:57:42,464 WARN  [api.commands.AddHostCmd] 
(catalina-exec-25:null) Exception:
com.cloud.exception.DiscoveryException: Unable to add the host
     at 
com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:737)
     at 
com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:544)
     at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:140)
     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
     at com.cloud.api.ApiServer.queueCommand(ApiServer.java:544)
     at com.cloud.api.ApiServer.handleRequest(ApiServer.java:423)
     at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:312)
     at com.cloud.api.ApiServlet.doGet(ApiServlet.java:64)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
     at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
     at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
     at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
     at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
     at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
     at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
     at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
     at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
     at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
     at 
org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
     at 
org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
     at 
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2274)
     at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:722)
2013-05-20 11:57:42,465 WARN  [cloud.api.ApiDispatcher] 
(catalina-exec-25:null) class com.cloud.api.ServerApiException : Unable 
to add the host
2013-05-20 11:57:42,466 DEBUG [cloud.api.ApiServlet] 
(catalina-exec-25:null) ===END===  192.168.1.187 -- GET 
command=addHost&zoneid=ac6c892c-8a9d-494b-8d3c-c612263059a0&podid=a4dad17d-f8dc-4734-a6b2-46954144b954&clusterid=b808e253-3aa2-4372-8520-84d8cda88c27&hypervisor=KVM&clustertype=CloudManaged&hosttags=&username=root&url=http%3A%2F%2Fmnode-1&response=json&sessionkey=%2BcU40ygVQMKpfKAI9il7SspVnBk%3D&_=1369039922245

===== END ====
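
To put a number on the "several minutes" the UI spent creating the host, the two relevant timestamps from the log above can be subtracted. This is only a minimal sketch: the function name `elapsed_seconds` is made up for illustration, and the timestamp format string is an assumption based on how the entries above are printed.

```python
from datetime import datetime

# Assumed layout of the timestamps above, e.g. "2013-05-20 11:52:05,403".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from management-server.log above.
add_host_start = "2013-05-20 11:52:05,403"     # ===START=== of the addHost call
discovery_timeout = "2013-05-20 11:57:42,458"  # KvmServerDiscoverer "Timeout, ..."

waited = elapsed_seconds(add_host_start, discovery_timeout)
print(f"addHost waited {waited:.3f} s before giving up")
```

The same helper can be pointed at any other pair of entries, for example the `cloud-setup-agent` command and its output line.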

agent.log in mnode-1:
==============
2013-05-20 10:48:55,603 ERROR [cloud.agent.AgentShell] (main:null) 
Unable to start agent: Failed to get public nic name
2013-05-20 11:04:28,473 INFO  [utils.component.ComponentLocator] 
(main:null) Unable to find components.xml
2013-05-20 11:04:28,474 INFO  [utils.component.ComponentLocator] 
(main:null) Skipping configuration using components.xml
2013-05-20 11:04:28,474 INFO  [cloud.agent.AgentShell] (main:null) 
Implementation Version is 4.0.2.20130420145617
2013-05-20 11:04:28,475 INFO  [cloud.agent.AgentShell] (main:null) 
agent.properties found at /etc/cloud/agent/agent.properties
2013-05-20 11:04:28,476 INFO  [cloud.agent.AgentShell] (main:null) 
Defaulting to using properties file for storage
2013-05-20 11:04:28,478 INFO  [cloud.agent.AgentShell] (main:null) 
Defaulting to the constant time backoff algorithm
2013-05-20 11:04:28,534 INFO  [cloud.agent.Agent] (main:null) id is
2013-05-20 11:04:28,544 ERROR [cloud.resource.ServerResourceBase] 
(main:null) Nics are not configured!
2013-05-20 11:04:28,550 INFO  [cloud.resource.ServerResourceBase] 
(main:null) Designating private to be nic em1.100
2013-05-20 11:04:28,638 INFO 
[resource.virtualnetwork.VirtualRoutingResource] (main:null) 
VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
2013-05-20 11:04:28,838 ERROR [cloud.agent.AgentShell] (main:null) 
Unable to start agent: Failed to get public nic name

2013-05-20 11:52:05,347 INFO  [utils.component.ComponentLocator] 
(main:null) Unable to find components.xml
2013-05-20 11:52:05,348 INFO  [utils.component.ComponentLocator] 
(main:null) Skipping configuration using components.xml
2013-05-20 11:52:05,348 INFO  [cloud.agent.AgentShell] (main:null) 
Implementation Version is 4.0.2.20130420145617
2013-05-20 11:52:05,349 INFO  [cloud.agent.AgentShell] (main:null) 
agent.properties found at /etc/cloud/agent/agent.properties
2013-05-20 11:52:05,350 INFO  [cloud.agent.AgentShell] (main:null) 
Defaulting to using properties file for storage
2013-05-20 11:52:05,352 INFO  [cloud.agent.AgentShell] (main:null) 
Defaulting to the constant time backoff algorithm
2013-05-20 11:52:05,409 INFO  [cloud.agent.Agent] (main:null) id is
2013-05-20 11:52:05,418 ERROR [cloud.resource.ServerResourceBase] 
(main:null) Nics are not configured!
2013-05-20 11:52:05,424 INFO  [cloud.resource.ServerResourceBase] 
(main:null) Designating private to be nic em1.100
2013-05-20 11:52:05,513 INFO 
[resource.virtualnetwork.VirtualRoutingResource] (main:null) 
VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
2013-05-20 11:52:05,712 ERROR [cloud.agent.AgentShell] (main:null) 
Unable to start agent: Failed to get public nic name

===== END ====
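
Because the mail client wrapped the log lines, the ERROR entries are easy to miss. The sketch below rejoins wrapped lines and prints only the ERROR messages. The helper names `join_wrapped_entries` and `error_messages` are hypothetical, and the entry layout (timestamp, level, `[logger]`, `(thread)`, message) is an assumption read off the excerpt above, not a documented format.

```python
import re

# Assumption: a new entry starts with a timestamp such as "2013-05-20 11:52:05,418 ".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Assumed fields: date, time, level, [logger], (thread), message.
ENTRY_FIELDS = re.compile(r"^\S+ \S+ (\w+)\s+\[[^\]]+\]\s+\([^)]*\)\s*(.*)$")


def join_wrapped_entries(text: str) -> list[str]:
    """Merge continuation lines into the entry they belong to."""
    entries: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def error_messages(text: str) -> list[str]:
    """Return the message part of every ERROR entry, in order."""
    errors = []
    for entry in join_wrapped_entries(text):
        match = ENTRY_FIELDS.match(entry)
        if match and match.group(1) == "ERROR":
            errors.append(match.group(2))
    return errors


# A few entries copied from agent.log above, wrapped as in the mail.
SAMPLE = """\
2013-05-20 11:52:05,409 INFO  [cloud.agent.Agent] (main:null) id is
2013-05-20 11:52:05,418 ERROR [cloud.resource.ServerResourceBase]
(main:null) Nics are not configured!
2013-05-20 11:52:05,424 INFO  [cloud.resource.ServerResourceBase]
(main:null) Designating private to be nic em1.100
2013-05-20 11:52:05,712 ERROR [cloud.agent.AgentShell] (main:null)
Unable to start agent: Failed to get public nic name
"""

for message in error_messages(SAMPLE):
    print(message)
```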

I have iptables disabled and SELinux set to permissive on both sides. I 
suspect there might be a problem with the bridge configuration on the 
KVM node (mnode-1). This is what I have at the moment:

network scripts:
===========

::::::::::::::
ifcfg-em1
::::::::::::::
DEVICE=em1
BOOTPROTO=none
BROADCAST=172.16.255.255
DNS1=192.168.1.204
GATEWAY=172.16.0.1
HWADDR=90:B1:1C:39:33:8A
IPADDR=172.16.2.3
NETMASK=255.255.0.0
NM_CONTROLLED=no
ONBOOT=yes
HOTPLUG=no
TYPE=Ethernet

::::::::::::::
ifcfg-em1.100
::::::::::::::
DEVICE=em1.100
HWADDR=90:B1:1C:39:33:8A
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
TYPE=Ethernet
VLAN=yes
IPADDR=172.16.5.1
GATEWAY=172.16.0.1
NETMASK=255.255.0.0

::::::::::::::
ifcfg-em1.200
::::::::::::::
DEVICE=em1.200
HWADDR=90:B1:1C:39:33:8A
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
TYPE=Ethernet
VLAN=yes
BRIDGE=cloudbr0

::::::::::::::
ifcfg-em1.300
::::::::::::::
DEVICE=em1.300
HWADDR=90:B1:1C:39:33:8A
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
TYPE=Ethernet
VLAN=yes
BRIDGE=cloudbr1

::::::::::::::
ifcfg-cloudbr0
::::::::::::::
DEVICE=cloudbr0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=no
IPV6_AUTOCONF=no
DELAY=5
STP=yes

::::::::::::::
ifcfg-cloudbr1
::::::::::::::
DEVICE=cloudbr1
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=no
IPV6_AUTOCONF=no
DELAY=5
STP=yes
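
To see the layout at a glance, the files above can be reduced to "which bridge has which member, and does it carry an IPv4 address". This is a minimal sketch with made-up helper names (`parse_ifcfg`, `summarize`). It only restates what the files declare and makes no claim about which setting the agent objects to.

```python
def parse_ifcfg(text: str) -> dict[str, str]:
    """Parse KEY=value lines from one ifcfg-* file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def summarize(configs: dict[str, str]) -> dict[str, dict]:
    """Map each TYPE=Bridge device to its member interfaces and IPADDR."""
    parsed = [parse_ifcfg(body) for body in configs.values()]
    bridges = {
        cfg["DEVICE"]: {"members": [], "ipaddr": cfg.get("IPADDR")}
        for cfg in parsed
        if cfg.get("TYPE") == "Bridge"
    }
    for cfg in parsed:
        bridge = cfg.get("BRIDGE")
        if bridge in bridges:
            bridges[bridge]["members"].append(cfg["DEVICE"])
    return bridges


# Trimmed copies of the files above (only the fields this sketch reads).
CONFIGS = {
    "ifcfg-em1": "DEVICE=em1\nTYPE=Ethernet\nIPADDR=172.16.2.3",
    "ifcfg-em1.100": "DEVICE=em1.100\nTYPE=Ethernet\nVLAN=yes\nIPADDR=172.16.5.1",
    "ifcfg-em1.200": "DEVICE=em1.200\nTYPE=Ethernet\nVLAN=yes\nBRIDGE=cloudbr0",
    "ifcfg-em1.300": "DEVICE=em1.300\nTYPE=Ethernet\nVLAN=yes\nBRIDGE=cloudbr1",
    "ifcfg-cloudbr0": "DEVICE=cloudbr0\nTYPE=Bridge\nBOOTPROTO=none",
    "ifcfg-cloudbr1": "DEVICE=cloudbr1\nTYPE=Bridge\nBOOTPROTO=none",
}

summary = summarize(CONFIGS)
for name, info in summary.items():
    print(name, "members:", info["members"], "ipaddr:", info["ipaddr"])
```

With the files as posted, both bridges have a single VLAN member and no IPADDR of their own. The only IPv4 addresses sit on em1 and em1.100.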


ifconfig output:
===========

cloudbr0  Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:1188 (1.1 KiB)

cloudbr1  Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:1188 (1.1 KiB)

em1       Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet addr:172.16.2.3  Bcast:172.16.255.255 Mask:255.255.0.0
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:89868 errors:0 dropped:0 overruns:0 frame:0
           TX packets:6620 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:20755835 (19.7 MiB)  TX bytes:479334 (468.0 KiB)
           Interrupt:16

em1.100   Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet addr:172.16.5.1  Bcast:172.16.255.255 Mask:255.255.0.0
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:1440 (1.4 KiB)

em1.200   Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3019 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:158774 (155.0 KiB)

em1.300   Link encap:Ethernet  HWaddr 90:B1:1C:39:33:8A
           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3020 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:158844 (155.1 KiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:118 errors:0 dropped:0 overruns:0 frame:0
           TX packets:118 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:10544 (10.2 KiB)  TX bytes:10544 (10.2 KiB)

virbr0    Link encap:Ethernet  HWaddr 52:54:00:37:05:54
           inet addr:192.168.122.1  Bcast:192.168.122.255 Mask:255.255.255.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)



The current status of cloud-agent is "cloud-agent dead but subsys 
locked". Attempting to restart the service results in the same error(s) 
being printed out to the agent.log.

I ran the cloud-setup-agent command I found in management-server.log, 
and it seems to work fine, but I'm still not able to bring the 
cloud-agent service up:

[root@mnode-1 network-scripts]# cloud-setup-agent  -m 172.16.2.2 -z 2 -p 
2 -c 2 -g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a --pubNic=cloudbr0 
--prvNic=cloudbr0 --guestNic=cloudbr0
Starting to configure your system:
Configure Cgroup ...          [OK]
Configure SElinux ...         [OK]
Configure Network ...         [OK]
Configure Libvirt ...         [OK]
Configure Firewall ...        [OK]
Configure Nfs ...             [OK]
Configure cloudAgent ...      [OK]
CloudStack Agent setup is done!
[root@mnode-1 network-scripts]# service cloud-agent restart
Stopping Cloud Agent:
Starting Cloud Agent:
[root@mnode-1 network-scripts]# service cloud-agent status
cloud-agent dead but subsys locked


What am I doing wrong? Thanks in advance for the help.


-- 
kind regards,

Javier Rodríguez

