Subject: Re: Strange bug? "spam" in management log files...
From: Andrija Panic
To: "dev@cloudstack.apache.org"
Cc: "users@cloudstack.apache.org"
Date: Fri, 12 Jun 2015 16:09:59 +0200

I'm sorry for spamming; it turns out that in the db.properties file on the second MGMT server, I had 127.0.0.1 as the cluster IP. After changing it to the real IP address, those "spam" log messages seem to be gone; everything has looked fine for a few hours now.
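For anyone hitting the same thing, this is a minimal sketch of the change, assuming the property is named cluster.node.IP as in a stock ACS 4.x /etc/cloudstack/management/db.properties (the address below is only an example; use the server's own IP):

    # /etc/cloudstack/management/db.properties on the second MGMT server
    # was: cluster.node.IP=127.0.0.1   (loopback, which the other MGMT server cannot reach)
    cluster.node.IP=10.20.10.8         # this node's real, reachable IP

The file is read at startup, so restart cloudstack-management on that node after the change.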
On 5 June 2015 at 16:24, Andrija Panic wrote:

> Hi,
>
> any hint on how to proceed?
>
> On haproxy I see roughly 50%/50% sessions across the 2 backend servers,
> but inside the DB it all points to the one mgmt_server_ip...
>
> Thanks,
> Andrija
>
> On 4 June 2015 at 19:27, Andrija Panic wrote:
>
>> And, if it is of any help, another hint:
>>
>> While these lines are being written to the logs in high volume, if I stop the second
>> mgmt server, the first one (the one producing all these lines) does not stop producing
>> them, so the log is still heavily written to. Only when I also restart mgmt on
>> the 1st node (with the 2nd node down) do these log lines disappear.
>>
>> Thx
>>
>> On 4 June 2015 at 19:19, Andrija Panic wrote:
>>
>>> And I could add - these lines (in this volume) only appear on the first
>>> mgmt server (actually I have 2 separate but identical ACS installations,
>>> and the same behaviour on both).
>>>
>>> On 4 June 2015 at 19:18, Andrija Panic wrote:
>>>
>>>> Just checked: in the host table, all agents are connected (via
>>>> haproxy) to the first mgmt server... I just restarted haproxy, and still
>>>> inside the DB it shows the same mgmt_server_id for all agents - which is not
>>>> really true.
>>>>
>>>> Actually, on haproxy itself (statistics page) I can see an almost
>>>> 50%-50% distribution across the 2 backends - so from haproxy's side it should
>>>> be fine: 18 agents in total, 10 go to one backend, 8 go to the other (ACS
>>>> mgmt server).
>>>>
>>>> This is our haproxy config. I think it's fine, but... the DB says
>>>> differently, although the haproxy statistics say all is fine:
>>>>
>>>> ### ACS 8250 ###################################################################
>>>> frontend front_ACS_8250 10.20.10.100:8250
>>>>     option tcplog
>>>>     mode tcp
>>>>     default_backend back_8250
>>>> backend back_8250
>>>>     mode tcp
>>>>     balance source
>>>>     server acs1_8250 10.20.10.7:8250 check port 8250 inter 2000 rise 3 fall 3
>>>>     server acs2_8250 10.20.10.8:8250 check port 8250 inter 2000 rise 3 fall 3
>>>> ###############################################################################
>>>>
>>>> Any info on how to proceed with this? Because of these lines the
>>>> mgmt logs are almost unreadable... :(
>>>>
>>>> Thanks,
>>>> Andrija
>>>>
>>>> On 4 June 2015 at 19:00, Andrija Panic wrote:
>>>>
>>>>> Thanks Koushik,
>>>>>
>>>>> I will check and let you know - but an 11 GB log file in 10 hours? I don't
>>>>> think that much is expected :)
>>>>> I understand the message is there because of the setup, it is just an awful
>>>>> lot of lines....
>>>>>
>>>>> Will check, thanks for the help!
>>>>>
>>>>> Andrija
>>>>>
>>>>> On 4 June 2015 at 18:53, Koushik Das wrote:
>>>>>
>>>>>> This is expected in a clustered MS setup. What is the distribution of
>>>>>> HV hosts across these MS (check the host table in the DB for the MS id)? The MS owning the
>>>>>> HV host processes all commands for that host.
>>>>>> Grep for the sequence numbers (e.g. 73-7374644389819187201) in
>>>>>> both MS logs to correlate.
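For anyone else debugging this, the two checks above can be run roughly as follows, assuming the default 'cloud' database and the standard log path (adjust for your install; the sequence number is just one of those from the logs quoted below):

    # which MS currently owns each agent, straight from the host table:
    mysql -u cloud -p cloud -e "SELECT id, name, status, mgmt_server_id FROM host WHERE removed IS NULL;"

    # correlate one sequence number across both management servers:
    grep "Seq 73-7374644389819187201" /var/log/cloudstack/management/management-server.log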
>>>>>>
>>>>>> On 04-Jun-2015, at 8:30 PM, Andrija Panic wrote:
>>>>>>
>>>>>> > Hi,
>>>>>> >
>>>>>> > I have 2 ACS MGMT servers, load-balanced properly (AFAIK), and sometimes it
>>>>>> > happens that on the first node we get an extreme number of the following
>>>>>> > line entries in the log file, which produces many GB of log in just a few hours
>>>>>> > or less (as you can see here they are not even that frequent, but sometimes it
>>>>>> > gets really crazy with the speed/number logged per second):
>>>>>> >
>>>>>> > 2015-06-04 16:55:04,089 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-29:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,129 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-28:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,129 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-8:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,169 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-26:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,169 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-30:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,209 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-27:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,209 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-2:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,249 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-4:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,249 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-7:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,289 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-3:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,289 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-5:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,329 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-1:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,330 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-15:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,369 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-11:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,369 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-17:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,409 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-14:null) Seq 1-3297479352165335041: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> > 2015-06-04 16:55:04,409 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-12:null) Seq 73-7374644389819187201: MgmtId 90520745449919: Resp: Routing to peer
>>>>>> >
>>>>>> > We have a haproxy VIP, to which the SSVM and all CloudStack agents connect
>>>>>> > (agent.properties file).
>>>>>> >
>>>>>> > Any suggestions on how to avoid this? I noticed that when I turn off the second ACS
>>>>>> > MGMT server and then reboot the first one (restart cloudstack-management), it
>>>>>> > stops and behaves nicely :)
>>>>>> >
>>>>>> > This is ACS 4.5.1, Ubuntu 14.04 for the mgmt nodes.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > --
>>>>>> > Andrija Panić

--
Andrija Panić
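For completeness, the agent side of that setup is just the management-server address in each agent's config pointing at the haproxy VIP rather than at either MGMT server directly. A sketch, assuming the stock agent.properties keys and the VIP from the haproxy config quoted above:

    # /etc/cloudstack/agent/agent.properties on each hypervisor host
    host=10.20.10.100   # haproxy VIP fronting both MGMT servers
    port=8250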