directory-dev mailing list archives

From Emmanuel Lecharny <>
Subject Re: [jira] [Commented] (DIRSERVER-1642) Unexpected behaviour in JdbmIndex
Date Fri, 19 Aug 2011 09:36:47 GMT
On 8/19/11 10:30 AM, Selcuk AYA wrote:
> On Fri, Aug 19, 2011 at 10:14 AM, Emmanuel Lecharny<>  wrote:
>> On 8/19/11 8:51 AM, Stefan Seelmann wrote:
>>> On Thu, Aug 18, 2011 at 10:41 PM, Selcuk AYA<>    wrote:
>>>> Hi,
>>>> Today we had some discussion with Alex, Emmanuel and others on how we
>>>> can improve jdbm consistency semantics. I had spent some time looking
>>>> into this issue and thought it could be useful to put a summary of my
>>>> findings here.
>>>> Currently, jdbm has issues with both concurrency and consistency:
>>>> 1) jdbm table lookup, insert and remove interfaces are synchronized
>>>> methods. So even if all the directory server does is perform lookups on
>>>> tables, all lookups will be serialized. Moreover, the record manager
>>>> operations are all synchronized methods too. This means, for example,
>>>> that while a sync of dirty pages to disk is in progress, no lookup
>>>> operation can go ahead.
>>>> 2) jdbm browser interface does not provide any consistency guarantees.
>>>> If there are underlying changes to the store while the browser is
>>>> open, then it might return inconsistent results. I think the situation
>>>> is even worse if the underlying record manager is CacheRecordManager
>>>> as the same page could be modified and read by a browser concurrently.
>>>> I have been working on a scheme which introduces what can be defined
>>>> as action consistency into the jdbm store.
>>>> 1) Actions are lookup, insert, remove and browse. Each action is
>>>> assigned a unique version. Actions are ReadWrite or ReadOnly.
>>>> 2) We allow one ReadWrite action and multiple ReadOnly actions to run
>>>> concurrently. So synchronized methods will be removed.
>>>> 3) We introduce a new record manager which caches jdbm B+ pages. Each
>>>> page in the cache has a version range [startVersion, endVersion). When
>>>> an action with version V1 wants to read a page, its read can be
>>>> satisfied from the version of that page for which
>>>> V1 >= startVersion && V1 < endVersion.
>>>> 4) Pages' previous versions are kept in memory. A page can be purged
>>>> when the minimum version among all active actions is >= endVersion.
>>>> So say we have three pages in a chain (A0->B0->C0) and each of them
>>>> has version range [0, infinity). A write action starts and gets the
>>>> version number 1. It updates B0 and C0 to B1 and C1 in any order.
>>>> After these two updates, B0 and C0 will have version range [0,1) and
>>>> B1 and C1 will have version range [1,infinity). Before the write
>>>> action completes, a read action comes, gets the current read version,
>>>> which is 0, and reads the chain. Since B0 and C0 will be the versions
>>>> that can satisfy this read, the read-only action will read the chain
>>>> A0->B0->C0. When the write action completes, it posts version 1 as the
>>>> read version. The first read action completes, a second one starts with
>>>> version 1 and that one will read A0->B1->C1. Since the minimum read
>>>> version is now 1, B0 and C0 can be zapped.
>>> Here I have a question: How can we detect that the read is finished?
>>> In the current JDBM implementation the "browse" action can take
>>> forever, there is no way to tell JDBM that browse is finished (i.e. a
>>> close() method).
> That is true. We will need to add a close() to the browse interface
> and that should tell jdbm that the read has finished. Since browse is
> embedded under cursor and cursor is supposed to be closed at some
> point (?), this is reasonable I think.

The cursor will be closed when we have read all the entries.
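
To make the link between closing a browse and releasing its version concrete, here is a minimal sketch. It assumes a close() is added to the browser as proposed above; ReadVersionTracker and VersionedBrowser are hypothetical names used for illustration only and are not existing JDBM classes.

```java
import java.util.TreeMap;

/** Hypothetical bookkeeping of the read versions pinned by open browsers. */
class ReadVersionTracker {
    // Version posted by the last completed write action.
    private long readVersion = 0;
    // Read version -> number of open readers still using it.
    private final TreeMap<Long, Integer> active = new TreeMap<>();

    /** Pins the current read version; called when a browse starts. */
    synchronized long acquire() {
        long v = readVersion;
        active.merge(v, 1, Integer::sum);
        return v;
    }

    /** Unpins a version; called when a browse is closed. */
    synchronized void release(long v) {
        active.computeIfPresent(v, (k, n) -> n == 1 ? null : n - 1);
    }

    /** Called by a write action when it completes. */
    synchronized void publish(long v) {
        readVersion = v;
    }

    /** Page copies whose endVersion is <= this value are no longer needed by any reader. */
    synchronized long minActiveVersion() {
        return active.isEmpty() ? readVersion : active.firstKey();
    }
}

/** Hypothetical browser that holds its read version until close() is called. */
class VersionedBrowser implements AutoCloseable {
    private final ReadVersionTracker tracker;
    private final long version;
    private boolean closed;

    VersionedBrowser(ReadVersionTracker tracker) {
        this.tracker = tracker;
        this.version = tracker.acquire();
    }

    long version() {
        return version;
    }

    @Override
    public void close() {
        // Idempotent, so that closing a cursor twice does not release twice.
        if (!closed) {
            closed = true;
            tracker.release(version);
        }
    }
}
```

With something like this, a browse that is never closed keeps minActiveVersion() pinned at its version, which is exactly why old page copies could not be purged without a close().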
>> First, a browse will end at some point. The most we can do is to read *all*
>> the entries from the master table using an index, but once it's done, the
>> browse will stop. I wondered yesterday if a persistent search could change
>> anything, but no: the way it's handled is very different, we just register
>> some listeners in the EventInterceptor, and every modification will trigger
>> one listener. This is not a browse by any means.
>> Now, I guess we will have to store the used revision somewhere (like in the
>> searchOperationContext), and when we don't have any more elements to send back
>> to the user, then we can 'close' the browse, releasing the revision.
> I assumed that each action (find, insert, remove, browse) is executed by
> one thread so I thought we can store an actionContext in a thread
> local variable as a thread enters an action. The version number can be
> stored in this context. This way, except for the close() call we add to
> the Browse interface, we can keep most of the changes local to jdbm.
Hmmm. The way it works, we execute a search on a single thread, but we 
also associate an operationContext instance which is carried all along 
the filters. Except that this instance is not passed to the JDBM layer. 
So, yes, it's probably a better idea to use a ThreadLocal variable 
here.  Although we have to be sure that we don't reuse what we store in 
this variable.
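
A minimal sketch of what such a thread-local context might look like, with the slot cleared at the end of each action so that a pooled thread cannot pick up a stale context. ActionContext and Actions are hypothetical names, and the version counter is only a placeholder, not the version assignment scheme discussed in this thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-action state: the version and the kind of action. */
final class ActionContext {
    final long version;
    final boolean readOnly;

    ActionContext(long version, boolean readOnly) {
        this.version = version;
        this.readOnly = readOnly;
    }
}

/** Hypothetical entry points that bind an ActionContext to the calling thread. */
final class Actions {
    private static final ThreadLocal<ActionContext> CURRENT = new ThreadLocal<>();
    // Placeholder counter (assumption): stands in for real version assignment.
    private static final AtomicLong NEXT_VERSION = new AtomicLong(0);

    private Actions() {
    }

    static ActionContext beginAction(boolean readOnly) {
        // Refuse to overwrite a context left behind by an unfinished action.
        if (CURRENT.get() != null) {
            throw new IllegalStateException("an action is already in progress on this thread");
        }
        ActionContext ctx = new ActionContext(NEXT_VERSION.incrementAndGet(), readOnly);
        CURRENT.set(ctx);
        return ctx;
    }

    static ActionContext current() {
        return CURRENT.get();
    }

    static void endAction() {
        // remove() rather than set(null), so nothing survives on a reused thread.
        CURRENT.remove();
    }
}
```

Callers would normally wrap the work in try/finally so that endAction() runs even when the operation throws.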

> With this, an insert implementation within B+tree looks like this for
> example:
> beginAction() // initialize action context, get a version number
> do the insert
> endAction()
> for Browse, we might have:
> Browse()
> {
> beginAction()
> }
> close()
> {
> endAction()
> }

Sounds good.
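
For reference, the [startVersion, endVersion) ranges and the A0->B0->C0 example from earlier in the thread could be modelled roughly as below. This is only a sketch under my own assumptions: PageVersion and VersionedPage are hypothetical names, Long.MAX_VALUE stands for "infinity", and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

/** One copy of a page, visible to versions in [startVersion, endVersion). */
final class PageVersion {
    final String content;
    final long startVersion; // inclusive
    long endVersion;         // exclusive; Long.MAX_VALUE means "infinity"

    PageVersion(String content, long startVersion) {
        this.content = content;
        this.startVersion = startVersion;
        this.endVersion = Long.MAX_VALUE;
    }

    boolean visibleTo(long v) {
        return v >= startVersion && v < endVersion;
    }
}

/** A page together with the older copies that readers may still need. */
final class VersionedPage {
    private final List<PageVersion> versions = new ArrayList<>();

    VersionedPage(String initialContent) {
        versions.add(new PageVersion(initialContent, 0));
    }

    /** A write action with version v closes the newest copy and adds a new one. */
    void update(String content, long v) {
        versions.get(versions.size() - 1).endVersion = v;
        versions.add(new PageVersion(content, v));
    }

    /** Returns the copy whose range contains the reader's version. */
    String read(long v) {
        for (PageVersion p : versions) {
            if (p.visibleTo(v)) {
                return p.content;
            }
        }
        throw new IllegalStateException("no copy visible to version " + v);
    }

    /** Drops copies whose endVersion is <= the minimum active version. */
    int purge(long minActiveVersion) {
        int before = versions.size();
        versions.removeIf(p -> p.endVersion <= minActiveVersion);
        return before - versions.size();
    }
}
```

With pages A, B and C created at version 0 and a write action with version 1 updating B and C, a reader at version 0 sees A0, B0, C0 and a reader at version 1 sees A0, B1, C1. Once the minimum active version is 1, purge(1) removes B0 and C0.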

Do you want us to create a branch to experiment around these ideas ?

Emmanuel Lécharny
