Date: Tue, 23 Oct 2012 17:53:45 +0200
Subject: Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100
From: Gilles Harloux
To: users@activemq.apache.org

Filed as https://issues.apache.org/jira/browse/AMQ-4128.

I agree that AMQ-3866 doesn't say a word about mKahaDB. It just caught
my eye because the symptoms and the logs look alike. Then I saw you
asked whether XA was involved, hence my reaction: I am not using XA
explicitly, so my best guess is that its use comes from mKahaDB /
cross-store transactions.

Anyway, thanks for your help!

On Tue, Oct 23, 2012 at 4:13 PM, Gary Tully wrote:
> ok, maybe open a new issue, attach your logs, and configure a
> small journal data file so that it can easily be attached once you
> find the block in gc.
>
> If recovery of local XA multi-store transactions is the problem, I
> would expect some log information from:
> org.apache.activemq.store.kahadb.MultiKahaDBTransactionStore
>
> use setJournalMaxFileLength=24k (it needs to be > 8k) on the
> persistence adapter.
>
> It may require killing the broker in the debugger to get to the bottom
> of this or using a system property that can kill the broker to force
> recovery in this case.
>
> An issue/jira ticket will put it on the radar in any event. I don't
> see that AMQ-3866 is using mkahadb.
>
> On 23 October 2012 14:47, Gilles Harloux wrote:
>> I doubt I can reproduce it as a unit test: What I see happens
>> (sometimes, sometimes not) because I sigkill the process. If I replace
>> the kill with an orderly shutdown, everything works out ok.
>>
>> On Tue, Oct 23, 2012 at 3:38 PM, Gary Tully wrote:
>>> do you think you could come up with a unit test that shows growth? So
>>> something that uses mkahadb and produces or consumes in a transaction
>>> from destinations that span the stores.
>>>
>>> You could configure the size of journal data files to 16k or
>>> something, such that any issue shows up very quickly.
>>>
>>> A test case like that would help clarify the use case and if it can
>>> reproduce, would be a great start to getting a quick resolution.
>>>
>>> There is a test that does a transacted send/receive across stores, but
>>> does not check for cleanup. It may be easy to extend it in that way.
>>>
>>> in activemq-core
>>> org.apache.activemq.store.StorePerDestinationTest#testTransactedSendReceiveAcrossStores
>>>
>>>
>>> On 23 October 2012 14:20, Gilles Harloux wrote:
>>>> I found that link while you were answering, it seems. I also found
>>>> https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
>>>> what I experience.
>>>>
>>>> I tried to delete the index, start a broker and debug as explained in
>>>> that ticket. Breakpointing at MessageDatabase#process, line 904
>>>> (version 5.7.0) tells me it seems there is XA in the mix (data is a
>>>> KahaAddMessageCommand instance, with
>>>> data.f_transactionInfo.f_localTransactionId == null and
>>>> data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
>>>> the use of mKahaDB, where I am transacting across both stores.
>>>>
>>>> In that ticket, you asked for the relevant data file. I can share mine
>>>> if there's any interest -- I just have to find a convenient way to
>>>> provide you with a 32 MB file.
>>>>
>>>> On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully wrote:
>>>>> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>>>>>
>>>>> The logging referenced there will show you what destinations are
>>>>> holding on to references to the journal data files.
>>>>>
>>>>> w.r.t. the usage %, the journal size increases in chunks of data file
>>>>> size, so a new journal data file can push the usage over the limit.
>>>>>
>>>>> On 23 October 2012 09:15, Gilles Harloux wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a system embedding a broker with KahaDB as a store. I am trying
>>>>>> to get a feel for disaster recovery behavior. So what I am basically
>>>>>> doing is randomly killing & restarting the process. I see a condition
>>>>>> (after killing & restarting multiple times) where the messages get
>>>>>> consumed, but KahaDB journal files (db-*.log) don't get reclaimed. As I
>>>>>> set up a storeUsage limit, it ends up blocking the system.
>>>>>>
>>>>>> I say the messages get consumed because I can see my application
>>>>>> logging what messages it gets to handle. Also, JMX tells me that the
>>>>>> broker's TotalMessageCount is zero, while the StorePercentUsage is
>>>>>> above 100 (depending on parameters such as message size & rate, I saw
>>>>>> anything from 112 to 293 percent usage).
>>>>>>
>>>>>> So, two questions:
>>>>>> - How is it possible (for any reason) to get a usage percentage
>>>>>> above 100? (In other situations, I saw it happen with memory too).
>>>>>> - Why is it I can't get KahaDB to reclaim seemingly unused journal files?
>>>>>>
>>>>>> TIA,
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> http://redhat.com
>>>>> http://blog.garytully.com
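
Gary's point about the usage percentage (the journal grows in chunks of
one data file, so a new data file can push usage past the limit) can be
illustrated with a bit of arithmetic. The sketch below is not ActiveMQ
code: the class and method names are hypothetical, and the 100 MB store
limit and 32 MB data file length are assumed values chosen only for
illustration. It also assumes usage is simply journal size divided by the
limit, ignoring the index and anything else the broker may count.

```java
// Hypothetical sketch, not ActiveMQ code: models a journal that only grows
// in whole data files and reports usage as a percentage of a store limit.
final class StoreUsageSketch {

    /** Bytes occupied when the journal holds `dataFiles` files of fixed length. */
    static long journalSize(long dataFileLength, int dataFiles) {
        return dataFileLength * dataFiles;
    }

    /** Usage as an integer percentage of the limit (truncating division). */
    static int percentUsage(long storeLimit, long dataFileLength, int dataFiles) {
        return (int) (journalSize(dataFileLength, dataFiles) * 100 / storeLimit);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long storeLimit = 100 * mb;      // assumed limit, for illustration only
        long dataFileLength = 32 * mb;   // assumed data file size
        for (int files = 1; files <= 5; files++) {
            System.out.printf("%d data file(s): %d%%%n", files,
                    percentUsage(storeLimit, dataFileLength, files));
        }
    }
}
```

With these assumed numbers, three data files sit at 96% of the limit and
the fourth takes the figure to 128%. Under this model, the longer
unreclaimed files pile up, the further above 100 the percentage climbs.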