Message-ID: <4F988C7C.1010002@gmail.com>
Date: Thu, 26 Apr 2012 01:45:00 +0200
From: Emmanuel Lécharny
Reply-To: elecharny@apache.org
To: Apache Directory Developers List
Subject: Re: JDBM + MVCC LRUCache concern, take 2
References: <4F7CC99C.40005@gmail.com> <4F7CD4BB.1030109@gmail.com>
In-Reply-To: <4F7CD4BB.1030109@gmail.com>

On 4/5/12 1:09 AM, Emmanuel Lécharny wrote:
> On 4/5/12 12:43 AM, Selcuk AYA wrote:
>> On Wed, Apr 4, 2012 at 3:22 PM, Emmanuel Lécharny wrote:
>>> It's systematic, and I guess that the fact that we now pound the
>>> RdnIndex table way more often than before (just because we no longer
>>> call the OneLevelIndex) causes the cache to fill up without being
>>> released fast enough.
>> Do we hold a cursor open while this code gets stuck? I would think we
>> hold a cursor open and modify quite a few JDBM btree pages for
>> this kind of behavior to happen.
>
> I'll check that.
>>> As we don't set any size for the cache, its default size is 1024.
>>> For some of the tests, this might not be enough, as we load a lot of
>>> entries (typically the schema elements) plus many others that get
>>> added and removed while running tests in revert mode.
>>>
>>> If I increase the default size to 65536, the tests pass.
>>>
>>> Ok, now, I have to admit I haven't - yet - looked at the LRUCache
>>> code, and my analysis is just based on what I saw by quickly looking
>>> at the code, the stack traces I have added and a few blind guesses.
>>> However, I think we have a serious issue here. As far as I can tell,
>>> the code itself is probably not responsible for this behaviour, but
>>> the way we use it is.
>>>
>>> Did I miss something? Is there anything we can do - except increase
>>> the cache size - to get the tests passing?
>>>
>>> I'm more concerned about what could occur in real life, when some
>>> users load the server up to the point where it just stops
>>> responding...
>> To avoid this issue, we can let the writers allocate more cache
>> pages (rather than keeping the cache size fixed) so that they do not
>> loop waiting for a replaceable cache page. However, I would again
>> suggest making sure we do not leave a cursor open. If we leave a
>> cursor open and keep allocating new cache pages for writes, we will
>> have other problems.
> Yeah, I can see how it may affect the tests. I'll definitely
> investigate this first, before going any further in another direction.
>
> ATM, I'm using an uncommitted version of JDBM where the default cache
> size has been changed.
>
> Thanks a lot Selcuk!

So I still have the LRUCache size issue, after having removed the
SubLevel index. Once I increase the size to 1 << 16, the tests pass.

The failing tests are those of the SearchAuthorizationIT class.

What happens is that when I add an entry, I update many elements in the
RdnIndex, as I have to modify the nbDescendant in all its parents. As
those tests inject a lot of entries, they do a lot of modifications in
the RdnIndex.

I checked that all the cursors are correctly closed.

Any clue?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
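
The nbDescendant update pattern described in the message can be sketched
roughly as follows. This is a hypothetical illustration, not ApacheDS or
JDBM code: the class DescendantCountSketch, its methods, and the plain
HashMap standing in for the RdnIndex are all assumptions, and DNs are
split naively on commas (escaped commas are ignored). It only shows why
each added entry turns into one index write per ancestor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ApacheDS or JDBM code.
// A HashMap stands in for the RdnIndex; keys are parent DNs.
final class DescendantCountSketch {
    // Assumed stand-in for the per-entry nbDescendant counter.
    private final Map<String, Integer> nbDescendant = new HashMap<>();
    // Running total of ancestor counters modified so far.
    private long totalUpdates = 0;

    // Adds an entry and increments the counter of every ancestor.
    // Returns how many ancestor records were modified for this entry.
    int add(String dn) {
        int touched = 0;
        int comma = dn.indexOf(',');
        while (comma >= 0) {
            // Everything after the comma is an ancestor DN.
            String parent = dn.substring(comma + 1);
            nbDescendant.merge(parent, 1, Integer::sum);
            touched++;
            comma = dn.indexOf(',', comma + 1);
        }
        totalUpdates += touched;
        return touched;
    }

    // Current descendant count recorded for the given parent DN.
    int descendants(String parentDn) {
        return nbDescendant.getOrDefault(parentDn, 0);
    }

    long totalUpdates() {
        return totalUpdates;
    }

    public static void main(String[] args) {
        DescendantCountSketch index = new DescendantCountSketch();
        // Inject many entries under the same branch, as the tests do.
        for (int i = 0; i < 1000; i++) {
            index.add("cn=user" + i + ",ou=users,ou=system");
        }
        System.out.println("ancestor updates: " + index.totalUpdates());
    }
}
```

In this sketch the number of modified records grows with the number of
added entries times their depth, which is the kind of write load that a
small fixed-size cache would have to absorb.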