cloudstack-users mailing list archives

From Edison Su <Edison...@citrix.com>
Subject RE: Cloudstack agent keeps rebooting kvm host..
Date Fri, 22 Jun 2012 18:05:05 GMT
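To raise the agent log level to DEBUG, run on the KVM host:
sed -i 's/INFO/DEBUG/g' /etc/cloud/agent/log4j-cloud.xml

To stop the heartbeat script from rebooting the host while you debug:
sed -i 's/reboot/#reboot/g' /usr/lib64/cloud/agent/scripts/vm/hypervisor/kvm/kvmheartbeat.sh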
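
A more cautious variant of the two commands above, as a sketch only: it applies the same sed expressions but keeps a copy of each file first, so the change can be reverted once the heartbeat problem is fixed. The function names and the ".orig" suffix are made up for illustration; the default paths are the ones from this thread and may differ on other installs.

```shell
#!/bin/sh
# Sketch: same edits as the sed commands in this message, with a backup.
# Function names and the .orig suffix are illustrative, not part of CloudStack.

LOG4J_CONF="${LOG4J_CONF:-/etc/cloud/agent/log4j-cloud.xml}"
HEARTBEAT_SH="${HEARTBEAT_SH:-/usr/lib64/cloud/agent/scripts/vm/hypervisor/kvm/kvmheartbeat.sh}"

# backup_once FILE: keep only the first copy, so a second run
# cannot overwrite the pristine original with an edited file.
backup_once() {
    [ -e "$1.orig" ] || cp -p "$1" "$1.orig"
}

# enable_debug FILE: switch every INFO to DEBUG in the log4j config.
enable_debug() {
    backup_once "$1" && sed -i 's/INFO/DEBUG/g' "$1"
}

# disable_reboot FILE: prefix every "reboot" with "#" in the heartbeat script.
disable_reboot() {
    backup_once "$1" && sed -i 's/reboot/#reboot/g' "$1"
}

# restore FILE: put the saved original back.
restore() {
    cp -p "$1.orig" "$1"
}
```

Typical use would be enable_debug "$LOG4J_CONF" and disable_reboot "$HEARTBEAT_SH" while debugging, then restore "$HEARTBEAT_SH" afterwards so the host is fenced again when the heartbeat really fails.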

> -----Original Message-----
> From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> Sent: Friday, June 22, 2012 10:58 AM
> To: cloudstack-users@incubator.apache.org
> Subject: RE: Cloudstack agent keeps rebooting kvm host..
> 
> Hi Edison,
> 
>   I did that earlier, before I added the host. It mounted perfectly.
> I will test it again in a bit after some sleep.
>   Is there a way to increase the debug level in the agent?
> 
> Thanks,
> Alex
> 
> -Alexey (sent via Android)
> On Jun 23, 2012 1:39 AM, "Edison Su" <Edison.su@citrix.com> wrote:
> 
> > You are using NFS primary storage, right? The NFS primary storage is
> > mounted at /mnt/9c2be815-de2b-3c14-84bb-54025d782794 after the agent
> > connects to the mgt server.
> > Then an NFS storage monitor is started, which writes a timestamp file
> > into the NFS primary storage. If that write fails, the NFS primary
> > storage is not usable.
> > How to diagnose the issue:
> > Mount the primary storage on the KVM host and check the permissions of
> > the mount point, or simply create a file under the mount point.
> > Usually this error comes from the NFS server setup. Please check the
> > NFS server setup and make sure the primary storage works on the KVM
> > host before adding it to the mgt server.
> >
> > > -----Original Message-----
> > > From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> > > Sent: Friday, June 22, 2012 10:12 AM
> > > To: cloudstack-users@incubator.apache.org
> > > Subject: Re: Cloudstack agent keeps rebooting kvm host..
> > >
> > > Hi Sadhu,
> > >
> > >   /mnt isn't a mounted filesystem. It's on the root filesystem.
> > > There should be no write errors, and I see it was able to create the
> > > main directory:
> > >
> > > [root@kvm1 mnt]# ls -altrh
> > > total 12K
> > > drwxr-xr-x   2 root root    6 Jun 22 22:14 kvm_primary_storage
> > > drwxr-xr-x.  4 root root 4.0K Jun 22 23:58 .
> > > drwxrwxrwx   3 root root 4.0K Jun 23 01:01 9c2be815-de2b-3c14-84bb-54025d782794
> > > dr-xr-xr-x. 24 root root 4.0K Jun 23 01:05 ..
> > >
> > > I even just created /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA/
> > > with full permissions and it still rebooted:
> > >
> > > 2012-06-23 01:09:23,830{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > 2012-06-23 01:09:23,830 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > 2012-06-23 01:10:22,774{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,774 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,797{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,797 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,821{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,821 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,843{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,843 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,866{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,866 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,867{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > 2012-06-23 01:10:22,867 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > >
> > > Broadcast message from root@kvm1.xxxx.xxx
> > >         (unknown) at 1:10 ...
> > >
> > > The system is going down for reboot NOW!
> > > Killing VMOps Agent (PID 4074) with SIGTERM
> > > Waiting for agent to exit
> > >
> > >
> > > -Alex
> > >
> > > On Sat, Jun 23, 2012 at 12:55 AM, Suresh Sadhu <Suresh.Sadhu@citrix.com> wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > When the heartbeat fails, the host will reboot continuously until
> > > > the problem is resolved (heartbeat successful).
> > > >
> > > > The heartbeat failure might be caused by a failure to write to the
> > > > mounted storage path. Did you see any permission denied messages in
> > > > the logs, and do your mounted storage paths still have rw
> > > > permissions after this problem? Due to some corruption in the
> > > > mounted FS, your mounted file system might have become read-only.
> > > > That might cause the heartbeat failure.
> > > >
> > > >
> > > >
> > > > Regards
> > > > Sadhu
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> > > > Sent: 22 June 2012 21:52
> > > > To: cloudstack-users@incubator.apache.org
> > > > Subject: Cloudstack agent keeps rebooting kvm host..
> > > >
> > > > Hi,
> > > >
> > > >  The saga continues! I added a KVM host. The agent decided it wants
> > > > to constantly reboot the server:
> > > >
> > > > 2012-06-23 00:11:32,083{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:11:32,083 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:12:30,187{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,187 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,209{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,209 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,232{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,232 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,254{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,254 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > > 2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > >
> > > > Broadcast message from root@kvm1.xxxxx.xxxx
> > > >        (unknown) at 0:12 ...
> > > >
> > > > The system is going down for reboot NOW!
> > > >
> > > > It looks like the agent was, in fact, at least able to create the
> > > > initial directory:
> > > >
> > > > [root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
> > > > total 8
> > > > drwxrwxrwx  2 root root 4096 Jun 22 23:58 .
> > > > drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..
> > > >
> > > > Here's the agent properties file:
> > > >
> > > > #Storage
> > > > #Sat Jun 23 00:11:32 MYT 2012
> > > > guest.network.device=cloudbr0
> > > > workers=5
> > > > private.network.device=cloudbr0
> > > > port=8250
> > > > resource=com.cloud.agent.resource.computing.LibvirtComputingResource
> > > > pod=1
> > > > zone=1
> > > > guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
> > > > cluster=2
> > > > public.network.device=cloudbr0
> > > > local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
> > > > host=10.1.1.18
> > > > LibvirtComputingResource.id=5
> > > >
> > > >
> > > > First time I'm seeing this error... Last time my kvm setup went
> > > > well, but KVM was my first hypervisor, now it's the second.
> > > >
> > > > Thanks!
> > > > Alex
> > > >
> >
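
The diagnosis steps quoted above (check that primary storage is really mounted on the KVM host, then simply create a file under the mount point) could be scripted roughly as follows. This is a sketch under assumptions: the function name check_primary and the probe file name are invented for illustration, and the mount check just looks the directory up in /proc/mounts, so the path must be given without a trailing slash.

```shell
#!/bin/sh
# Sketch: is DIR an actual mount, and can a file be created under it?
# check_primary and the probe file name are illustrative only.

check_primary() {
    dir="$1"

    # /proc/mounts lists "device mountpoint fstype ..." per line.
    if grep -qsF " $dir " /proc/mounts; then
        echo "mounted: $dir"
    else
        echo "not mounted: $dir (plain directory on the local filesystem?)"
    fi

    # Same idea as the manual test: create a file under the mount point.
    probe="$dir/probe-$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "write ok: $dir"
    else
        echo "write failed: $dir"
        return 1
    fi
}
```

For the host in this thread the call would be check_primary /mnt/9c2be815-de2b-3c14-84bb-54025d782794. A "not mounted" result together with "write ok" would mean the directory is writable but is not the NFS export, which is worth ruling out before adding the host to the mgt server.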
