directory-dev mailing list archives

From Selcuk AYA <>
Subject Re: [jira] [Commented] (DIRSERVER-1642) Unexpected behaviour in JdbmIndex
Date Fri, 19 Aug 2011 11:19:30 GMT
On Fri, Aug 19, 2011 at 12:36 PM, Emmanuel Lecharny <> wrote:
> On 8/19/11 10:30 AM, Selcuk AYA wrote:
>> On Fri, Aug 19, 2011 at 10:14 AM, Emmanuel Lecharny<>
>>  wrote:
>>> On 8/19/11 8:51 AM, Stefan Seelmann wrote:
>>>> On Thu, Aug 18, 2011 at 10:41 PM, Selcuk AYA<>
>>>>  wrote:
>>>>> Hi,
>>>>> Today we had some discussion with Alex, Emmanuel and others on how we
>>>>> can improve jdbm consistency semantics. I had spent some time looking
>>>>> into this issue and thought it could be useful to put a summary of my
>>>>> findings here.
>>>>> Currently, jdbm has issues with both concurrency and consistency:
>>>>> 1) jdbm table lookup, insert and remove interfaces are synchronized
>>>>> methods. So even if all the directory server does is lookups on
>>>>> tables, all lookups will be serialized. Moreover, the record manager
>>>>> operations are all synchronized methods too. This means, for example,
>>>>> while sync of dirty pages to disk goes on, no lookup operation can go
>>>>> ahead.
>>>>> 2) jdbm browser interface does not provide any consistency guarantees.
>>>>> If there are underlying changes to the store while the browser is
>>>>> open, then it might return inconsistent results. I think the situation
>>>>> is even worse if the underlying record manager is CacheRecordManager
>>>>> as the same page could be modified and read by a browser concurrently.
>>>>> I have been working on a scheme which introduces what can be defined
>>>>> as action consistency into the jdbm store.
>>>>> 1) Actions are lookup, insert, remove and browse. Each action is
>>>>> assigned a unique version. Actions are ReadWrite or ReadOnly.
>>>>> 2) We allow one ReadWrite action and multiple ReadOnly actions to run
>>>>> concurrently. So synchronized methods will be removed.
>>>>> 3) We introduce a new record manager which caches jdbm B+ tree pages.
>>>>> Each page in the cache has a version range [startVersion, endVersion).
>>>>> When an action with version V1 wants to read a page, its read can be
>>>>> satisfied from the version of that page with V1 >= startVersion &&
>>>>> V1 < endVersion.
>>>>> 4) Pages' previous versions are kept in memory. A page can be purged
>>>>> when the minimum version among all active actions is >= endVersion.
>>>>> So say we have three pages in a chain (A0->B0->C0) and each of
>>>>> them has version range [0, infinity). A write action starts and gets
>>>>> the version number 1. It updates B0 and C0 to B1 and C1 in any order.
>>>>> After these two updates, B0 and C0 will have version range [0,1)
>>>>> and B1 and C1 will have version range [1,infinity). Before the write
>>>>> action completes, a read action comes, gets the current read version
>>>>> which is 0 and reads the chain. Since B0 and C0 will be the versions
>>>>> that can satisfy this read, the read only action will read the chain
>>>>> A0->B0->C0. When the write action completes, it posts version 1 as
>>>>> the new read version. The first read action completes, a second one
>>>>> starts with version 1 and that one will read A0->B1->C1. Since the
>>>>> minimum version is now 1, B0 and C0 can be zapped.
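
The version-range scheme described above can be sketched roughly as follows. This is only an illustration under assumptions: VersionedPageCache, PageVersion, write, read and purge are hypothetical names that do not exist in jdbm, pages are reduced to strings, and all locking is left out so that the visibility rule (V1 >= startVersion && V1 < endVersion) and the purge rule (minimum active version >= endVersion) stay visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these class and method names are not part of jdbm.
// Not thread-safe; locking is omitted to keep the version logic readable.
class VersionedPageCache {
    static final long INFINITY = Long.MAX_VALUE;

    // One version of a page, valid for the half-open range [startVersion, endVersion).
    static final class PageVersion {
        final String content;
        final long startVersion;
        long endVersion = INFINITY;

        PageVersion(String content, long startVersion) {
            this.content = content;
            this.startVersion = startVersion;
        }

        boolean visibleTo(long version) {
            return version >= startVersion && version < endVersion;
        }
    }

    private final Map<String, List<PageVersion>> pages = new HashMap<>();

    // A write at writeVersion ends the newest existing version and adds a new one.
    void write(String pageId, String content, long writeVersion) {
        List<PageVersion> versions = pages.computeIfAbsent(pageId, k -> new ArrayList<>());
        if (!versions.isEmpty()) {
            versions.get(versions.size() - 1).endVersion = writeVersion;
        }
        versions.add(new PageVersion(content, writeVersion));
    }

    // Returns the content visible to readVersion, or null if no version matches.
    String read(String pageId, long readVersion) {
        for (PageVersion pv : pages.getOrDefault(pageId, Collections.emptyList())) {
            if (pv.visibleTo(readVersion)) {
                return pv.content;
            }
        }
        return null;
    }

    // Drops versions that no active action can still see; returns how many were dropped.
    int purge(long minActiveVersion) {
        int purged = 0;
        for (List<PageVersion> versions : pages.values()) {
            for (Iterator<PageVersion> it = versions.iterator(); it.hasNext();) {
                if (it.next().endVersion <= minActiveVersion) {
                    it.remove();
                    purged++;
                }
            }
        }
        return purged;
    }

    public static void main(String[] args) {
        VersionedPageCache cache = new VersionedPageCache();
        cache.write("A", "A0", 0);
        cache.write("B", "B0", 0);
        cache.write("C", "C0", 0);
        cache.write("B", "B1", 1); // the write action with version 1
        cache.write("C", "C1", 1);
        // A reader with version 0 sees A0->B0->C0, a reader with version 1 sees A0->B1->C1.
        System.out.println(cache.read("A", 0) + "->" + cache.read("B", 0) + "->" + cache.read("C", 0));
        System.out.println(cache.read("A", 1) + "->" + cache.read("B", 1) + "->" + cache.read("C", 1));
        // Once the minimum active version is 1, B0 and C0 can be dropped.
        System.out.println("purged: " + cache.purge(1));
    }
}
```

Keeping the superseded versions in a per-page list is just one possible layout; the thread does not say how the new record manager would store them.
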
>>>> Here I have a question: How can we detect that the read is finished?
>>>> In the current JDBM implementation the "browse" action can take
>>>> forever, there is no way to tell JDBM that browse is finished (i.e. a
>>>> close() method).
>> that is true. We will need to add a close() to the browse interface
>> and that should tell jdbm that the read finished. Since browse is
>> embedded under cursor and cursor is supposed to be closed at some
>> point ( ? ), this is reasonable I think.
> The cursor will be closed when we have read all the entries.
>>> First, browse will end at some point. The most we can do is to read *all*
>>> the entries from the master table using an index, but once it's done, the
>>> browse will stop. I wondered yesterday if a persistent search could change
>>> anything, but no: the way it's handled is very different, we just register
>>> some listeners in the EventInterceptor, and every modification will trigger
>>> one listener. This is not a browse by any means.
>>> Now, I guess we will have to store the used revision somewhere (like in the
>>> searchOperationContext), and when we don't have any more elements to send
>>> back to the user, then we can 'close' the browse, releasing the revision.
>> I assumed that each action (find, insert, remove, browse) is executed by
>> one thread so I thought we can store an actionContext in a thread
>> local variable as a thread enters an action. The version number can be
>> stored in this context. This way, except for the close() call we add to
>> the Browse interface, we can keep most of the changes local to jdbm.
> Hmmm. The way it works, we execute a search on a single thread, but we also
> associate an operationContext instance which is carried all along the
> filters. Except that this instance is not passed to the JDBM layer. So, yes,
> it's probably a better idea to use a ThreadLocal variable here.  Although we
> have to be sure that we don't reuse what we store in this variable.
>> With this, an insert implementation within B+tree looks like this for
>> example:
>> beginAction() // initialize action context, get a version number
>> do the insert
>> endAction()
>> for Browse, we might have:
>> Browse()
>> {
>>     beginAction()
>> }
>> close()
>> {
>>     endAction()
>> }
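
The beginAction()/endAction() idea with a thread-local context might look roughly like the sketch below. Everything here is an assumption made for illustration: ActionManager, ActionContext, Browser and minActiveVersion are hypothetical names, not jdbm API. The sketch also simplifies in two ways. A second ReadWrite action is rejected with an exception instead of being made to wait, and each thread may have only one open action, so a browser has to be closed on the thread that opened it. endAction() removes the thread-local value so that a stale context is never reused by a later action on the same thread.

```java
import java.util.TreeMap;

// Hypothetical sketch: ActionManager, ActionContext and Browser are illustrative names, not jdbm API.
class ActionManager {
    static final class ActionContext {
        final long version;
        final boolean readOnly;

        ActionContext(long version, boolean readOnly) {
            this.version = version;
            this.readOnly = readOnly;
        }
    }

    private final ThreadLocal<ActionContext> current = new ThreadLocal<>();
    // Number of open ReadOnly actions per version, ordered so the minimum is cheap to find.
    private final TreeMap<Long, Integer> activeReaders = new TreeMap<>();
    private long readVersion = 0;       // latest version posted by a completed write
    private boolean writerActive = false;

    synchronized ActionContext beginAction(boolean readOnly) {
        if (current.get() != null) {
            throw new IllegalStateException("an action is already open on this thread");
        }
        long version;
        if (readOnly) {
            version = readVersion;
            activeReaders.merge(version, 1, Integer::sum);
        } else {
            if (writerActive) {
                // Simplification: a real implementation would make the writer wait.
                throw new IllegalStateException("only one ReadWrite action at a time");
            }
            writerActive = true;
            version = readVersion + 1;
        }
        ActionContext ctx = new ActionContext(version, readOnly);
        current.set(ctx);
        return ctx;
    }

    synchronized void endAction() {
        ActionContext ctx = current.get();
        if (ctx == null) {
            throw new IllegalStateException("no action is open on this thread");
        }
        current.remove(); // make sure the context cannot be reused by a later action
        if (ctx.readOnly) {
            activeReaders.computeIfPresent(ctx.version, (v, n) -> n == 1 ? null : n - 1);
        } else {
            readVersion = ctx.version; // post the new read version
            writerActive = false;
        }
    }

    // Page versions with endVersion <= this value are no longer visible to any reader.
    synchronized long minActiveVersion() {
        return activeReaders.isEmpty() ? readVersion : activeReaders.firstKey();
    }

    public static void main(String[] args) {
        ActionManager manager = new ActionManager();
        manager.beginAction(false); // an insert: begin, do the work, end
        manager.endAction();
        try (Browser browser = new Browser(manager)) {
            System.out.println("browsing at version " + browser.version);
        }
        System.out.println("minimum active version: " + manager.minActiveVersion());
    }
}

// A browse stays open across many calls, so it begins the action when created
// and ends it only in close().
class Browser implements AutoCloseable {
    private final ActionManager manager;
    final long version;

    Browser(ActionManager manager) {
        this.manager = manager;
        this.version = manager.beginAction(true).version;
    }

    @Override
    public void close() {
        manager.endAction();
    }
}
```
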
> Sounds good.
> Do you want us to create a branch to experiment around these ideas ?
I already cloned the code and am experimenting on it. Will keep you
posted on how it goes.
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny

Selcuk AYA
