From: Edison Su
To: "cloudstack-users@incubator.apache.org", "'alexeyzilber@gmail.com'"
Date: Fri, 22 Jun 2012 11:05:05 -0700
Subject: RE: Cloudstack agent keeps rebooting kvm host..

On the KVM host, to enable DEBUG logging in the agent:

sed -i 's/INFO/DEBUG/g' /etc/cloud/agent/log4j-cloud.xml

To disable the reboot:

sed -i 's/reboot/#reboot/g' /usr/lib64/cloud/agent/scripts/vm/hypervisor/kvm/kvmheartbeat.sh

> -----Original Message-----
> From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> Sent: Friday, June 22, 2012 10:58 AM
> To: cloudstack-users@incubator.apache.org
> Subject: RE: Cloudstack agent keeps rebooting kvm host..
>
> Hi Edison,
>
> I did that earlier, before I added the host. It mounted perfectly. I
> will test it again in a bit after some sleep.
> Is there a way to increase the debug level in the agent?
>
> Thanks,
> Alex
>
> -Alexey (sent via Android)
> On Jun 23, 2012 1:39 AM, "Edison Su" wrote:
>
> > Are you using NFS primary storage? The NFS primary storage will be
> > mounted at /mnt/9c2be815-de2b-3c14-84bb-54025d782794 after the agent
> > connects to the mgt server.
> > Then an NFS storage monitor is started, which writes a timestamp file
> > into the NFS primary storage. If that write fails, the NFS primary
> > storage is not usable.
> > How to diagnose the issue:
> > Mount the primary storage on the KVM host and check the permissions of
> > the mount point, or simply create a file under the mount point.
> > Usually this error comes from the NFS server setup. Please check the
> > NFS server setup and make sure the primary storage works on the KVM
> > host before adding it to the mgt server.
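The write test described above (create a file under the mount point) can be sketched as a small shell function. This is only a sketch: check_writable and the probe file name are made-up names for illustration, not part of CloudStack or its kvmheartbeat.sh script.

```shell
#!/bin/sh
# Hypothetical helper (not CloudStack code): checks that a directory can be
# written to by creating and then removing a small probe file in it.
# $1 = directory to test, e.g. the primary storage mount point
check_writable() {
    dir="$1"
    # The path must exist and be a directory.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Probe file name is an assumption; $$ (the shell PID) keeps it unique.
    probe="$dir/.write-probe-$$"
    if ! touch "$probe" 2>/dev/null; then
        echo "cannot create a file in: $dir" >&2
        return 2
    fi
    # Clean up so the test leaves nothing behind.
    rm -f "$probe"
    return 0
}
```

For example, running check_writable /mnt/9c2be815-de2b-3c14-84bb-54025d782794 on the KVM host should return 0 only if a file can actually be created there; a read-only or misconfigured export would be expected to make it return 2.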
> >
> > > -----Original Message-----
> > > From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> > > Sent: Friday, June 22, 2012 10:12 AM
> > > To: cloudstack-users@incubator.apache.org
> > > Subject: Re: Cloudstack agent keeps rebooting kvm host..
> > >
> > > Hi Sadhu,
> > >
> > > /mnt isn't a mounted filesystem. It's on the root filesystem. There
> > > should be no write errors, and I see it was able to create the main
> > > directory:
> > >
> > > [root@kvm1 mnt]# ls -altrh
> > > total 12K
> > > drwxr-xr-x 2 root root 6 Jun 22 22:14 kvm_primary_storage
> > > drwxr-xr-x. 4 root root 4.0K Jun 22 23:58 .
> > > drwxrwxrwx 3 root root 4.0K Jun 23 01:01 9c2be815-de2b-3c14-84bb-54025d782794
> > > dr-xr-xr-x. 24 root root 4.0K Jun 23 01:05 ..
> > >
> > > I even just created /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA/
> > > with full permissions and it still rebooted:
> > >
> > > 2012-06-23 01:09:23,830{GMT} INFO [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > 2012-06-23 01:09:23,830 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > 2012-06-23 01:10:22,774{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,774 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,797{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,797 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,821{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,821 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,843{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,843 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,866{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,866 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,867{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > 2012-06-23 01:10:22,867 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > >
> > > Broadcast message from root@kvm1.xxxx.xxx
> > > (unknown) at 1:10 ...
> > >
> > > The system is going down for reboot NOW!
> > > Killing VMOps Agent (PID 4074) with SIGTERM
> > > Waiting for agent to exit
> > >
> > >
> > > -Alex
> > >
> > > On Sat, Jun 23, 2012 at 12:55 AM, Suresh Sadhu wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > When the heartbeat fails, the host will reboot continuously until the
> > > > problem is resolved (heartbeat successful).
> > > >
> > > > The heartbeat failure might be caused by a failure to write to the
> > > > mounted storage path. Did you see any "permission denied" messages in
> > > > the logs, and do your mounted storage paths still have rw permissions
> > > > after this problem? Corruption in the mounted file system can make it
> > > > read-only, and that might cause the heartbeat failure.
> > > >
> > > > Regards
> > > > Sadhu
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> > > > Sent: 22 June 2012 21:52
> > > > To: cloudstack-users@incubator.apache.org
> > > > Subject: Cloudstack agent keeps rebooting kvm host..
> > > >
> > > > Hi,
> > > >
> > > > The saga continues! I added a KVM host. The agent decided it wants to
> > > > constantly reboot the server:
> > > >
> > > > 2012-06-23 00:11:32,083{GMT} INFO [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:11:32,083 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:12:30,187{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,187 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,209{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,209 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,232{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,232 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,254{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,254 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > > 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > >
> > > > Broadcast message from root@kvm1.xxxxx.xxxx
> > > > (unknown) at 0:12 ...
> > > >
> > > > The system is going down for reboot NOW!
> > > >
> > > > It looks like the agent was, in fact, at least able to create the
> > > > initial directory:
> > > >
> > > > [root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
> > > > total 8
> > > > drwxrwxrwx 2 root root 4096 Jun 22 23:58 .
> > > > drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..
> > > >
> > > > Here's the agent properties file:
> > > >
> > > > #Storage
> > > > #Sat Jun 23 00:11:32 MYT 2012
> > > > guest.network.device=cloudbr0
> > > > workers=5
> > > > private.network.device=cloudbr0
> > > > port=8250
> > > > resource=com.cloud.agent.resource.computing.LibvirtComputingResource
> > > > pod=1
> > > > zone=1
> > > > guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
> > > > cluster=2
> > > > public.network.device=cloudbr0
> > > > local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
> > > > host=10.1.1.18
> > > > LibvirtComputingResource.id=5
> > > >
> > > >
> > > > First time I'm seeing this error... Last time my kvm setup went well,
> > > > but KVM was my first hypervisor, now it's the second.
> > > >
> > > > Thanks!
> > > > Alex
> > > >
> > >
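
A directory can exist at the expected path without NFS actually being mounted on it, so it may be worth checking the mount table as well as the directory listing. Below is a minimal sketch, assuming a Linux host where /proc/mounts lists the active mounts. is_nfs_mounted and its optional second argument (an alternative mounts file, handy for testing) are hypothetical names, not CloudStack tooling.

```shell
#!/bin/sh
# Hypothetical helper (not CloudStack code): succeeds only if the given path
# is listed as a mount point whose filesystem type starts with "nfs"
# (for example nfs or nfs4).
# $1 = mount point to look for, written exactly as it appears in the mounts
#      table (no trailing slash)
# $2 = mounts file to read (assumption: defaults to /proc/mounts on Linux)
is_nfs_mounted() {
    # In /proc/mounts, field 2 is the mount point and field 3 is the type.
    awk -v target="$1" '
        $2 == target && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For example, is_nfs_mounted /mnt/9c2be815-de2b-3c14-84bb-54025d782794 || echo "not an NFS mount" would flag the case where the path is only a plain directory on the root filesystem.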