cloudstack-users mailing list archives

From Nik Martin <nik.mar...@nfinausa.com>
Subject Storage failure is not handled well in CS
Date Tue, 02 Oct 2012 20:12:23 GMT
I have two SANs connected to CS as primary storage.  One is an HD-based
SAN with a single target and LUN, and the other is an SSD SAN split
into two volumes, each connected with a target and LUN.  The HD SAN is
where all system VMs are stored (or they were before I added the SSD SAN,
but I have no idea where the system VM volumes are stored).  This
morning I had to do a semi-emergency shutdown of the SSD SAN, so I put
both LUNs in emergency maintenance mode in CS.  CS shut down the entire
cloud, not just the volumes stored on the SSD SAN.  The SAN is offline,
and CS shows it in maintenance mode, but no VMs will start, and the CS
management log shows:

onnecting; event = AgentDisconnected; new status = Alert; old update 
count = 959; new update count = 960]
2012-10-02 15:10:40,370 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentTaskPool-2:null) Notifying other nodes of to disconnect
2012-10-02 15:10:40,370 WARN  [cloud.resource.ResourceManagerImpl] (AgentTaskPool-2:null) Unable to connect due to
com.cloud.exception.ConnectionException: Unable to connect to pool Pool[204|IscsiLUN]
	at
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:679)
Caused by: com.cloud.exception.StorageUnavailableException: Resource [StoragePool:204] is unreachable: Unable establish connection from storage head to storage pool 204 due to ModifyStoragePoolCommand add XenAPIException:Can not see storage pool: cfd3b016-d4d9-3bb9-b1f9-f31374c44185 from on host:82cad07f-6fbc-464e-86fe-28bb4af4bbcd host:82cad07f-6fbc-464e-86fe-28bb4af4bbcd pool: 172.16.10.15/iqn.2012-01:com.nfinausa.san2:mirror0/0
	at com.cloud.storage.StorageManagerImpl.connectHostToSharedPool(StorageManagerImpl.java:1567)
	at com.cloud.storage.listener.StoragePoolMonitor.processConnect(StoragePoolMonitor.java:88)
	... 8 more
2012-10-02 15:10:40,371 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) Transition:[Resource state = Enabled, Agent event = AgentDisconnected, Host id = 6, name = hv1]
2012-10-02 15:10:40,375 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) Agent status update: [id = 6; name = hv1; old status = Alert; event = AgentDisconnected; new status = Alert; old update count = 960; new update count = 961]


host:82cad07f-6fbc-464e-86fe-28bb4af4bbcd pool: 
172.16.10.15/iqn.2012-01:com.nfinausa.san2:mirror0/1 is the SAN that is 
in maintenance mode, so why is CS still trying to connect?  All my HVs
are in alert state because of this.

-- 
Regards,

Nik

Nik Martin
VP Business Development
Nfina Technologies, Inc.
+1.251.243.0043 x1003
Relentless Reliability
