cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Prasanna Santhanam (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CLOUDSTACK-4499) Xen6.1/Xen6.2 hosts initially transition to 'Alert' and then to 'Up' after addHost
Date Mon, 26 Aug 2013 08:41:53 GMT
Prasanna Santhanam created CLOUDSTACK-4499:
----------------------------------------------

             Summary: Xen6.1/Xen6.2 hosts initially transition to 'Alert' and then to 'Up'
after addHost
                 Key: CLOUDSTACK-4499
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4499
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
            Reporter: Prasanna Santhanam
            Priority: Critical


The same bug was reported in CLOUDSTACK-3839 which has been determined to be a
UI issue but the problem is with the addHost API and Xenserver hosts 6.1/6.2. 

When a freshly provisioned 6.1/6.2 host is added to cloudstack it initially
goes into 'Alert' state thereby failing the storage pool creation subsequently.
StoragePool addition requires the hosts to be in Up state. But a minute or two
later the Xenserver host automatically moves back to Up state after which
storage pool addition can occur

When the host is added to cloudstack:
2013-08-26 06:46:22,965 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-1:null) Seq
1-1916403719: Executing request
2013-08-26 06:46:23,261 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-1:null) Can't
find a vif on dom0 for link local, creating a new one
2013-08-26 06:46:23,274 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-1:null) Lowest
available Vif device number: 0 for VM: Control domain on host: apache-81-3
2013-08-26 06:46:23,600 WARN  [xen.resource.CitrixResourceBase] (DirectAgent-1:null) Unable
to create local link network
The server failed to handle your request, due to an internal error.  The given message may
give details useful for debugging the problem.
    at com.xensource.xenapi.Types.checkResponse(Types.java:1694)
    at com.xensource.xenapi.Connection.dispatch(Connection.java:368)
    at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909)
    at com.xensource.xenapi.VIF.plug(VIF.java:846)


Host marked Alert:
2013-08-26 06:46:23,616 DEBUG [agent.manager.AgentManagerImpl] (catalina-exec-14:null) Sending
Disconnect to listener: com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
2013-08-26 06:46:23,616 DEBUG [cloud.network.NetworkUsageManagerImpl] (catalina-exec-14:null)
Disconnected called on 1 with status Alert
2013-08-26 06:46:23,616 DEBUG [agent.manager.AgentManagerImpl] (catalina-exec-14:null) Sending
Disconnect to listener: com.cloud.consoleproxy.ConsoleProxyListener
2013-08-26 06:46:23,619 DEBUG [cloud.host.Status] (catalina-exec-14:null) Transition:[Resource
state = Enabled, Agent event = AgentDisconnected, Host id = 1, name = apache-81-3]
2013-08-26 06:46:23,625 DEBUG [cloud.host.Status] (catalina-exec-14:null) Agent status update:
[id = 1; name = apache-81-3; old status = Connecting; event = AgentDisconnected; new status
= Alert; old update count = 1; new update count = 2]

Automatic retry by CloudStack a few moments later:
2013-08-26 06:46:52,752 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager Timer:null) Completed
acquiring hosts for clusters not owned by any management server
2013-08-26 06:46:52,756 DEBUG [agent.manager.ClusteredAgentManagerImpl] (ClusteredAgentManager
Timer:null) Found 2 unmanaged direct hosts, processing connect for them...
2013-08-26 06:46:52,757 DEBUG [agent.manager.ClusteredAgentManagerImpl] (ClusteredAgentManager
Timer:null) Loading directly connected host 1(apache-81-3)

Meanwhile Storage Pool addition fails:
2013-08-26 06:46:52,766 DEBUG [cloud.api.ApiServlet] (catalina-exec-17:null) ===END===  10.208.8.5
-- GET  username=root&apiKey=sLUjl2n1c33JJpY4ZzMtObUgp9Ah2xW5inofCclZQQ1V6xp5k3CmAsGhemLwQxVvQY67EiopM_fbWBoJRKCIww&podid=fdf30680-d5c7-425e-bd7f-fb8c52e7cb96&hypervisor=XenServer&clusterid=10ff9a88-54bc-405b-9b4b-c26acf9caca8&zoneid=0431023e-ff78-43b6-a32d-262f36a78cbb&command=addHost&url=http%3A%2F%2Fapache-81-2&signature=oJqOno9flCtPMSSHH%2FZD%2BJl5QXw%3D&response=json
2013-08-26 06:46:52,773 DEBUG [cloud.api.ApiServlet] (catalina-exec-16:null) ===START=== 
10.208.8.5 -- GET  apiKey=sLUjl2n1c33JJpY4ZzMtObUgp9Ah2xW5inofCclZQQ1V6xp5k3CmAsGhemLwQxVvQY67EiopM_fbWBoJRKCIww&name=z0p0c0ps0&url=nfs%3A%2F%2Fnfs.fmt.vmops.com%3A%2Fexport%2Fautomation%2Facs%2Fprimary&podid=fdf30680-d5c7-425e-bd7f-fb8c52e7cb96&clusterid=10ff9a88-54bc-405b-9b4b-c26acf9caca8&zoneid=0431023e-ff78-43b6-a32d-262f36a78cbb&command=createStoragePool&signature=70CuFbmf58PU4X0QbF1tKbBGT60%3D&response=json
2013-08-26 06:46:52,787 DEBUG [cloud.network.NetworkModelImpl] (ClusteredAgentManager Timer:null)
Failed to retrive the default label for storage traffic:zone: 1 hypervisor: XenServer due
to:Unable to find the default physical network with traffic=Storage in the specified zone
id
2013-08-26 06:46:52,839 DEBUG [datastore.lifecycle.CloudStackPrimaryDataStoreLifeCycleImpl]
(catalina-exec-16:null) createPool Params @ scheme - nfs storageHost - nfs.fmt.vmops.com hostPath
- /export/automation/acs/primary port - -1
2013-08-26 06:46:52,889 DEBUG [agent.manager.ClusteredAgentManagerImpl] (ClusteredAgentManager
Timer:null) Loading directly connected host 2(apache-81-2)
2013-08-26 06:46:52,892 DEBUG [agent.manager.AgentManagerImpl] (AgentTaskPool-2:null) Simulating
start for resource 192267e4-d97d-4e37-b899-82b157e3166f id 1
2013-08-26 06:46:52,900 DEBUG [cloud.storage.StorageManagerImpl] (catalina-exec-16:null) Failed
to add data store
com.cloud.utils.exception.CloudRuntimeException: No host up to associate a storage pool with
in cluster 1
    at org.apache.cloudstack.storage.datastore.lifecycle.CloudStackPrimaryDataStoreLifeCycleImpl.attachCluster(CloudStackPrimaryDataStoreLifeCycleImpl.java:371)
    at com.cloud.storage.StorageManagerImpl.createPool(StorageManagerImpl.java:749)
    at com.cloud.storage.StorageManagerImpl.createPool(StorageManagerImpl.java:177)


Host comes back to Up state:
2013-08-26 06:46:53,898 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-4:null) Seq
1-1519321095: Executing request
2013-08-26 06:46:54,279 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-4:null) already
have a vif on dom0 for link local network
2013-08-26 06:46:54,703 DEBUG [agent.transport.Request] (AgentTaskPool-2:null) Seq -1--1:
Startup request from directly connected host:  { Cmd , MgmtId: -1, via: -1, Ver: v1, Flags:
11, [{"cpus":4,"speed":2261,"memory":16190149248,"dom0MinMemory":588251136,"poolSync":false,"_clusterVMStates":{},"caps":"xen-3.0-x86_64
, xen-3.0-x86_32p , hvm-3.0-x86_32 , hvm-3.0-x86_32p , hvm-3.0-x86_64","pool":"9dfa936a-30f7-b5d1-4d4e-b5a75facb313","hypervisorType":"XenServer","hostDetails":{"product_version":"6.1.0","private.network.device":"Pool-wide
network associated with eth0","com.cloud.network.Networks.RouterPrivateIpStrategy":"DcGlobal","product_brand":"XenServer","product_version_text_short":"6.1","can_bridge_firewall":"false"},"hypervisorVersion":"6.1.0","type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"192267e4-d97d-4e37-b899-82b157e3166f","name":"apache-81-3","version":"4.2.0","iqn":"iqn.2013-08.com.vmops.fmt:691cf4d5","publicIpAddress":"10.208.8.69","publicNetmask":"255.255.255.192","publicMacAddress":"b8:ac:6f:94:f1:67","privateIpAddress":"10.208.8.69","privateMacAddress":"b8:ac:6f:94:f1:67","privateNetmask":"255.255.255.192","storageIpAddress":"10.208.8.69","storageNetmask":"255.255.255.192","storageMacAddress":"b8:ac:6f:94:f1:67","wait":0},{"totalSize":0,"poolInfo":{"uuid":"ae248ed7-dd2a-cfc8-53f7-f43661c596c8","host":"10.208.8.69","localPath":"lvm","hostPath":"lvm","poolType":"LVM","capacityBytes":491505319936,"availableBytes":491501125632},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","guid":"192267e4-d97d-4e37-b899-82b157e3166f","wait":0}]
}
2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to BaremetalPxeManagerImpl_EnhancerByCloudStack_d79b93ae
2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to PremiumSecondaryStorageManagerImpl_EnhancerByCloudStack_b51c5fde
2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to LxcServerDiscoverer_EnhancerByCloudStack_449ab3e6
2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to NiciraNvp
2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to OvmDiscoverer_EnhancerByCloudStack_ec7ad054
2013-08-26 06:46:54,710 DEBUG [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null)
Dispatching resource state event CREATE_HOST_VO_FOR_DIRECT_CONNECT to XcpServerDiscoverer_EnhancerByCloudStack_831158ad
2013-08-26 06:46:54,710 INFO  [xen.discoverer.XcpServerDiscoverer] (AgentTaskPool-2:null)
Host: apache-81-3 connected with hypervisor type: XenServer. Checking CIDR...
2013-08-26 06:46:54,735 DEBUG [cloud.resource.ResourceState] (AgentTaskPool-2:null) Resource
state update: [id = 1; name = apache-81-3; old state = Enabled; event = InternalCreated; new
state = Enabled]
2013-08-26 06:46:54,736 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) Transition:[Resource
state = Enabled, Agent event = AgentConnected, Host id = 1, name = apache-81-3]
2013-08-26 06:46:54,742 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) Agent status update:
[id = 1; name = apache-81-3; old status = Connecting; event = AgentConnected; new status =
Connecting; old update count = 3; new update count = 4]
2013-08-26 06:46:54,946 DEBUG [cloud.host.Status] (AgentTaskPool-1:null) Agent status update:
[id = 1; name = apache-81-3; old status = Connecting; event = Ready; new status = Up; old
update count = 4; new update count = 5]



There appear to be two problems:
1. addHost API returns successfully even if the host has transitioned to 'Alert' state
2. Something wrong with Xenserver 6.2/6.1 hosts that makes this happen


This is on the latest off the 4.2 branch (8689f7f).

Attaching management server log and Database Dump.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message