MIME-Version: 1.0
In-Reply-To: <67EF18FDCA335F489B366120481AB6C5EE3AC0D49E@BANPMAILBOX01.citrite.net>
References: <67EF18FDCA335F489B366120481AB6C5EE3AC0D49E@BANPMAILBOX01.citrite.net>
From: Alexey Zilber
Date: Sat, 23 Jun 2012 01:14:14 +0800
Subject: Re: Cloudstack agent keeps rebooting kvm host..
To: cloudstack-users@incubator.apache.org
Content-Type: multipart/alternative; boundary=20cf302d4e2a251ca004c312c4ba

--20cf302d4e2a251ca004c312c4ba
Content-Type: text/plain; charset=ISO-8859-1

Is there no way to disable HA checks? I don't understand why it has to do
HA checks when there's only one host in the cluster. It seems like a bug to
have HA on with a single server.

-Alex

On Sat, Jun 23, 2012 at 12:55 AM, Suresh Sadhu wrote:

> Hi Alex,
>
> When the heartbeat fails, the host will reboot continuously until the
> problem is resolved (heartbeat successful).
>
> The heartbeat failure might be caused by a failure to write to the mounted
> storage path. Did you see any "permission denied" messages in the logs, and
> do your mounted storage paths still have rw permissions after this problem?
> Corruption in the mounted FS can make the file system read-only, and that
> might cause the heartbeat failure.
>
> Regards
> Sadhu
>
> -----Original Message-----
> From: Alexey Zilber [mailto:alexeyzilber@gmail.com]
> Sent: 22 June 2012 21:52
> To: cloudstack-users@incubator.apache.org
> Subject: Cloudstack agent keeps rebooting kvm host..
>
> Hi,
>
> The saga continues! I added a KVM host.
> The agent decided it wants to constantly reboot the server:
>
> 2012-06-23 00:11:32,083{GMT} INFO [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> 2012-06-23 00:11:32,083 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> 2012-06-23 00:12:30,187{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> 2012-06-23 00:12:30,187 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> 2012-06-23 00:12:30,209{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> 2012-06-23 00:12:30,209 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> 2012-06-23 00:12:30,232{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> 2012-06-23 00:12:30,232 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> 2012-06-23 00:12:30,254{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> 2012-06-23 00:12:30,254 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
>
> Broadcast message from root@kvm1.xxxxx.xxxx
> (unknown) at 0:12 ...
>
> The system is going down for reboot NOW!
>
> It looks like the agent was, in fact, at least able to create the initial
> directory:
>
> [root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
> total 8
> drwxrwxrwx 2 root root 4096 Jun 22 23:58 .
> drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..
>
> Here's the agent properties file:
>
> #Storage
> #Sat Jun 23 00:11:32 MYT 2012
> guest.network.device=cloudbr0
> workers=5
> private.network.device=cloudbr0
> port=8250
> resource=com.cloud.agent.resource.computing.LibvirtComputingResource
> pod=1
> zone=1
> guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
> cluster=2
> public.network.device=cloudbr0
> local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
> host=10.1.1.18
> LibvirtComputingResource.id=5
>
> First time I'm seeing this error... Last time my kvm setup went well, but
> KVM was my first hypervisor, now it's the second.
>
> Thanks!
> Alex
--20cf302d4e2a251ca004c312c4ba--
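The write check Suresh describes can be reproduced outside the agent. What follows is a minimal Python sketch, not the CloudStack agent's actual code: it tries to create a heartbeat-style file under a mount path a fixed number of times and reports whether the filesystem is mounted read-only. The function names `is_read_only` and `probe_heartbeat_write`, the default of 5 attempts (chosen to mirror retries 0 to 4 in the quoted log), and the delay between attempts are all assumptions made for illustration; only the KVMHA/hb-<ip> path layout is taken from the log.

```python
import os
import tempfile
import time


def is_read_only(mount_path):
    """Return True if the filesystem holding mount_path is mounted read-only."""
    return bool(os.statvfs(mount_path).f_flag & os.ST_RDONLY)


def probe_heartbeat_write(mount_path, host_ip, retries=5, delay=0.02):
    """Try to write a heartbeat-style file, recording one error per failed attempt.

    Returns (ok, errors). The KVMHA/hb-<ip> layout is copied from the quoted
    log; the retry count and delay are hypothetical stand-ins.
    """
    hb_dir = os.path.join(mount_path, "KVMHA")
    hb_file = os.path.join(hb_dir, "hb-" + host_ip)
    errors = []
    for attempt in range(retries):
        try:
            # Create the directory if needed, then write a timestamp into the file.
            os.makedirs(hb_dir, exist_ok=True)
            with open(hb_file, "w") as fh:
                fh.write(str(int(time.time())))
            return True, errors
        except OSError as exc:
            errors.append("retry %d: %s" % (attempt, exc))
            time.sleep(delay)
    return False, errors


if __name__ == "__main__":
    # Demo against a scratch directory rather than a real primary storage mount.
    with tempfile.TemporaryDirectory() as scratch:
        ok, errors = probe_heartbeat_write(scratch, "probe")
        print("read-only:", is_read_only(scratch))
        print("heartbeat write ok:", ok, errors)
```

On a real host the first argument would be the mount point from the log (/mnt/9c2be815-de2b-3c14-84bb-54025d782794). Passing a label such as "probe" instead of the host IP keeps the sketch from touching the file the agent itself writes. The collected OSError messages should show whether the cause is a permission problem, a read-only filesystem, or a mount that is missing altogether.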