db-derby-dev mailing list archives

From Dibyendu Majumdar <dibye...@mazumdar.demon.co.uk>
Subject Re: Derby architecture/design documents
Date Tue, 01 Feb 2005 21:31:35 GMT
Philip Bohannon wrote some time back:
> The classic problem here is a split taking place during a search.  Say  a
> page C contains 50-100, and C's parent in the tree is P, which has an
> entry for 50 and 101, with the entry for 50 pointing to C.
> 
> Now an inserter comes in, inserts 51 and fills up C (or fills up C's
> child, causes a split, and that fills up C).  So C splits into C and C',
> which is added to the right of C.
> 
> Since the inserter can't get a lock on P while holding a lock on C
> (deadlock), it releases the lock on P.  At this point, a reader comes down
> the tree looking for 99, looks at P and gets a pointer to C not C'. Now
> inserter starts waiting for a lock on P.  When reader gets C, 99 has been
> moved off to C'.
> 
> It would be interesting to know how Derby handles this (for example, [Yao]
> proposes having a pointer from C to C', and I forget what Aries IM does at
> the moment, but I think it starts over at the top if you fail to find a
> key at the rightmost point in the page...).

Philip,

I don't fully understand how Derby works, but from what I have seen so
far, I think that Derby avoids the above situation as follows:

It latches the parent P first and then the child C exclusively, and
only then starts the split. A searcher has to wait until the split is
over.

In Mike's words:

[In order to prevent deadlocks latches requested while holding other 
latches are always requested top/down and left to right.  Btree splits 
are always left to right.  If for any reason the code needs to break 
this protocol then it will first request the latch NOWAIT and if it 
can't get the latch it will release all current latches and wait for the 
latch it is trying to get, and then after obtaining it go about getting 
the other latches it needs for the particular operation.  While 
traversing down the tree Derby may hold 2 latches: one on parent and one 
on child.]
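
To make the NOWAIT rule above concrete, here is a minimal sketch in Java.
It is not Derby code: the Page class and the latchOutOfOrder method are
hypothetical names, java.util.concurrent's ReentrantLock stands in for a
page latch, and only a single held latch is handled. It assumes the caller
holds a latch on one page and needs a latch on a page that comes earlier in
the top/down, left to right order.

```java
import java.util.concurrent.locks.ReentrantLock;

class LatchProtocolSketch {
    /** Hypothetical page with one exclusive latch; not a Derby class. */
    static final class Page {
        final String name;
        final ReentrantLock latch = new ReentrantLock();
        Page(String name) { this.name = name; }
    }

    /**
     * The caller holds the latch on 'held' and needs 'wanted', which comes
     * earlier in the top/down, left to right order.
     * Returns true if 'wanted' was obtained without waiting, false if
     * 'held' had to be released first (the caller must then re-check
     * 'held', since it may have changed while unlatched).
     * On return both latches are held by the calling thread.
     */
    static boolean latchOutOfOrder(Page held, Page wanted) {
        if (wanted.latch.tryLock()) {   // NOWAIT request
            return true;
        }
        held.latch.unlock();            // give up what we hold ...
        wanted.latch.lock();            // ... wait for the latch we need ...
        held.latch.lock();              // ... then re-latch in protocol order
        return false;
    }

    public static void main(String[] args) {
        Page parent = new Page("P");
        Page child = new Page("C");
        child.latch.lock();
        boolean noWait = latchOutOfOrder(child, parent);
        System.out.println("got parent without waiting: " + noWait);
        child.latch.unlock();
        parent.latch.unlock();
    }
}
```

The boolean result is the important part of the design: when it is false,
the page that was released may have been split or modified in the meantime,
so whatever was read from it earlier has to be looked up again.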

The question is what happens if parent P is also full and needs to be 
split. This is where things get really complicated. If my understanding 
is correct, Derby solves this in a simple and elegant way.

It releases the latches on C and P, and starts the split at P with the 
separator key. So, at that point, P becomes C, and P's parent, let us 
say Q, becomes P.

I think that this is repeated recursively up the tree until the original 
P has been split with room to insert the separator key. Then Derby 
proceeds with the original split.
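
The control flow described above can be sketched as follows. This is only
my reading of the idea, not Derby's implementation: it is single-threaded
with all latching left out, the median key moves up on a split, keys are
assumed distinct, and every name in it (SplitUpwardSketch, Node, MAX_KEYS,
split) is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitUpwardSketch {
    static final int MAX_KEYS = 3;            // hypothetical tiny page capacity

    static final class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>();   // empty for a leaf
        Node parent;
        boolean isLeaf() { return children.isEmpty(); }
        boolean isFull() { return keys.size() >= MAX_KEYS; }
    }

    Node root = new Node();

    /** Index of the child to descend into (or the insert position in a leaf). */
    static int childIndex(Node n, int key) {
        int i = 0;
        while (i < n.keys.size() && key > n.keys.get(i)) i++;
        return i;
    }

    Node findLeaf(int key) {
        Node n = root;
        while (!n.isLeaf()) n = n.children.get(childIndex(n, key));
        return n;
    }

    boolean contains(int key) {
        Node n = root;
        while (true) {
            if (n.keys.contains(key)) return true;
            if (n.isLeaf()) return false;
            n = n.children.get(childIndex(n, key));
        }
    }

    void insert(int key) {
        while (true) {
            Node leaf = findLeaf(key);
            if (!leaf.isFull()) {
                leaf.keys.add(childIndex(leaf, key), key);
                return;
            }
            split(leaf);                      // make room, then search again from the top
        }
    }

    /** Split c, first splitting its parent if the parent has no room. */
    void split(Node c) {
        if (c.parent == null) {               // c is the root: grow the tree by one level
            Node newRoot = new Node();
            newRoot.children.add(c);
            c.parent = newRoot;
            root = newRoot;
        }
        if (c.parent.isFull()) {
            split(c.parent);                  // P plays the role of C, Q the role of P
        }
        Node p = c.parent;                    // re-read: c may now sit under P's new sibling
        int mid = c.keys.size() / 2;
        int sep = c.keys.get(mid);            // separator key that moves up into p
        Node right = new Node();              // C', added to the right of C
        right.keys.addAll(c.keys.subList(mid + 1, c.keys.size()));
        if (!c.isLeaf()) {
            right.children.addAll(c.children.subList(mid + 1, c.children.size()));
            for (Node ch : right.children) ch.parent = right;
            c.children.subList(mid + 1, c.children.size()).clear();
        }
        c.keys.subList(mid, c.keys.size()).clear();
        int pos = p.children.indexOf(c);
        p.keys.add(pos, sep);
        p.children.add(pos + 1, right);
        right.parent = p;
    }

    public static void main(String[] args) {
        SplitUpwardSketch tree = new SplitUpwardSketch();
        for (int k = 1; k <= 20; k++) tree.insert(k);
        System.out.println("contains 13: " + tree.contains(13));
    }
}
```

Note that split() re-reads c.parent after the recursive call, because
splitting P may have moved C under P's new right sibling. In a concurrent
version this is also the point where the latches on P and C would have to
be obtained again before the original split goes ahead.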

Perhaps Mike or someone else can confirm or fill in with more details.

I have not seen the above algorithm in any of the papers I have read, but
then, my reading is limited to the following papers:

C. Mohan and Frank Levine. ARIES/IM: an efficient and high concurrency
index management method with write-ahead logging. ACM SIGMOD Record,
V21 N2, p371-380, June 1992.

Jim Gray and Andreas Reuter. Chapter 15: Access Paths. Transaction
Processing: Concepts and Techniques. Morgan Kaufmann Publishers, 1993.

Philip L. Lehman and S. Bing Yao. Efficient Locking for Concurrent
Operations on B-Trees. Readings in Database Systems, Third Edition.
Morgan Kaufmann Publishers, 1998.

David Lomet and Betty Salzberg. Access method concurrency with recovery.
ACM SIGMOD Record, V21 N2, p351-360, June 1992.


Regards

Dibyendu

