Subject: Re: Agent dies every night/morning.... memory violation
From: Marcus
To: "dev@cloudstack.apache.org"
Cc: "users@cloudstack.apache.org"
Date: Mon, 23 Feb 2015 08:21:58 -0800

It doesn't really sound like an agent problem, but rather some other root
problem that is causing issues for the agent. Perhaps it is specific to the
host simply because there is a particular VM that always runs on that host
and the VM itself is triggering the issue. Perhaps a heavy logrotate or cron
job on the VM causes issues for librados. Just grasping at straws here.

From the output provided, it does seem that the libvirt bindings that include
Ceph code are terminating the agent execution. My guess is that if you focus
on "why this host" as opposed to "what's going on", you'll find the answer to
both. Sorry, I know that's not much help.

On Mon, Feb 23, 2015 at 7:29 AM, Andrija Panic wrote:

> Anybody?, before I start to cry :(
>
> On 21 February 2015 at 21:18, Andrija Panic wrote:
>
>> Hi Simon,
>>
>> selinux is disabled, I have just double checked.
>>
>> BTW, this is what I can see in the cloudstack-agent.err log - seems like
>> some CEPH related issues, but not sure why the agent would die...
>> If I recall correctly, this might be happening since the CEPH update from
>> 0.80.3? to 0.87 - and this seems like some crash in librados....
>>
>>
>> libust[1907/2046]: Warning: HOME environment variable not set. Disabling
>> LTTng-UST per-user tracing.
>> (in setup_local_apps() at lttng-ust-comm.c:305)
>> libvirt: error : name in virDomainLookupByName must not be NULL
>> libvirt: error : name in virDomainLookupByName must not be NULL
>> libvirt: error : name in virDomainLookupByName must not be NULL
>> libvirt: error : name in virDomainLookupByName must not be NULL
>> libvirt: Storage Driver error : failed to remove volume
>> 'cloudstack/bd751250-de35-4d2e-a4e3-3ee4b636c2a7': Device or resource busy
>> ./log/SubsystemMap.h: In function 'bool
>> ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
>> 7f04427fc700 time 2015-02-21 06:39:38.839210
>> ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
>> ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
>> 1: (()+0x1fe223) [0x7f060c932223]
>> 2: (ObjectCacher::flusher_entry()+0x155) [0x7f060c9866e5]
>> 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f060c9976cd]
>> 4: (()+0x79d1) [0x7f06605ee9d1]
>> 5: (clone()+0x6d) [0x7f066033bb5d]
>> NOTE: a copy of the executable, or `objdump -rdS ` is needed
>> to interpret this.
>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>> 21/02/2015 06:39:38 1905 jsvc.exec error: Service did not exit cleanly
>>
>> On 20 February 2015 at 21:56, Simon Weller wrote:
>>
>>> Andrija,
>>>
>>> What is SELinux set to on this host?
>>>
>>>
>>> - SI
>>>
>>>
>>> ________________________________________
>>> From: Andrija Panic
>>> Sent: Friday, February 20, 2015 6:06 AM
>>> To: dev@cloudstack.apache.org; users@cloudstack.apache.org
>>> Subject: Agent dies every night/morning....
>>> memory violation
>>>
>>> Hi,
>>>
>>> I have a crazy agent on one of the hosts that is being killed each morning,
>>> and I found this in /var/log/audit.log:
>>>
>>> type=ANOM_ABEND msg=audit(1424321463.930:430678): auid=0 uid=0 gid=0
>>> ses=68891 pid=10831 comm="jsvc" reason="memory violation" sig=6
>>>
>>> I don't remember changing anything on the system, but this keeps happening
>>> each morning around the same time, 5.20am-5.40am.
>>>
>>> I'm wondering what the heck is happening, any suggestions where to
>>> troubleshoot?
>>> Will check the logs in detail anyway...
>>>
>>> --
>>>
>>> Andrija Panić
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić
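
For anyone lining up the audit record quoted above against the agent log, here is a rough sketch of how the ANOM_ABEND line could be decoded. The helper name `parse_audit_record` is made up for illustration and is not part of any CloudStack or audit tooling; it only assumes the `key=value` layout seen in the quoted record. Signal 6 is SIGABRT, which would be consistent with the failed assertion and the "terminate called" message in the agent log.

```python
import re
import signal
from datetime import datetime, timezone

# Matches the "msg=audit(<epoch>.<millis>:<serial>)" stamp in an audit record.
AUDIT_STAMP = re.compile(r"msg=audit\((\d+)\.(\d+):(\d+)\)")


def parse_audit_record(line):
    """Hypothetical helper: split one audit record into a dict of fields.

    Adds 'time' (UTC datetime), 'serial', and 'signal_name' when present.
    """
    fields = {}
    # Values are either double-quoted strings or runs of non-space characters.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line):
        fields[key] = value.strip('"')
    stamp = AUDIT_STAMP.search(line)
    if stamp:
        fields["time"] = datetime.fromtimestamp(int(stamp.group(1)), tz=timezone.utc)
        fields["serial"] = int(stamp.group(3))
    if "sig" in fields:
        fields["signal_name"] = signal.Signals(int(fields["sig"])).name
    return fields


record = parse_audit_record(
    'type=ANOM_ABEND msg=audit(1424321463.930:430678): auid=0 uid=0 gid=0 '
    'ses=68891 pid=10831 comm="jsvc" reason="memory violation" sig=6'
)
print(record["time"].isoformat(), record["comm"], record["signal_name"])
```

The timestamp is printed in UTC, so it needs shifting to the host's local time zone before comparing it with the times in cloudstack-agent.err.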