cloudstack-dev mailing list archives

From Haijiao <18602198...@163.com>
Subject Hosts can not connect to secondary storage (NFS)
Date Thu, 29 Sep 2016 04:36:46 GMT
Hi, Devs

 We have a small production environment consisting of 5 hosts (KVM, Ubuntu 14.04), and the secondary
storage is NFS running on a separate management host.

 A few days ago, we mistakenly put one host into 'maintenance', which caused all the VMs running on
that host to migrate to the other available hosts. But these hosts then turned to the 'alert' or 'disconnected'
state in the ACS UI, and meanwhile the kernel log shows the repeated message 'kernel:
[3270144.284365] nfs: server 10.226.32.4 not responding, timed out'.
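
 For reference, here is a minimal Python sketch of how the repeated message could be counted per NFS
server from saved kernel log lines. The function name and the second sample line are hypothetical, and
the pattern assumes the exact message format quoted above.

```python
import re
from collections import Counter

# Assumed format: taken from the kernel message quoted in this mail.
NFS_TIMEOUT = re.compile(r"nfs: server (\S+) not responding, timed out")


def count_nfs_timeouts(lines):
    """Return a Counter mapping NFS server address to its number of timeout messages."""
    counts = Counter()
    for line in lines:
        match = NFS_TIMEOUT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "kernel: [3270144.284365] nfs: server 10.226.32.4 not responding, timed out",
        "kernel: [3270150.000000] some unrelated message",  # hypothetical line
    ]
    for server, hits in count_nfs_timeouts(sample).items():
        print(server, hits)
```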

 It seems none of the hosts can mount or unmount the NFS storage. We had to use 'umount
-lf' to forcibly unmount the NFS share, and we got the host state back to normal by restarting libvirt
and the CloudStack agent. But the issue is still there: none of the hosts can mount NFS, and they
always fail with the same error 'nfs: server 10.226.32.4 not responding, timed out'.

 To isolate this issue, we added a fresh host to the environment, and it can communicate
with NFS with no problem. So the issue seems to happen only on the existing 5 hosts. We
guess it could be fixed by rebooting the hosts, but we cannot afford that right now since
they are all running production apps.

 

Can anyone share some advice or hints on how to get the secondary storage back? Thanks a lot!

