db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Derby architecture/design documents
Date Wed, 02 Feb 2005 00:17:13 GMT
I will be submitting compare/contrast paper with ARIES/IM.  hopefully
the following will answer this question.

Unlike ARIES/IM SMO (structure modification operations - ie. split),
only happen top down.  This is not as concurrent as bottom up
ARIES/IM, but was simpler.  Do note though that as in ARIES IM "Not more
than 2 index pages are held latched simultaneously at anytime.  In order
to improve concurrency and to avoid deadlocks involving latches, even
those latches are not held while waiting for a lock wich is not
immediately grantable.  No data page latch is held or acquired during an
index access.  Latch coupliing is used while traversing the triee - ie.
the latch on a parent page is held while requesting a latch on a child
page."

So simple case of split exclusive latch is held on P (parent), and
C (child).  If there is room in P, then new page C1 (child right) is
created, new descriminator is inserted into P, rows moved C1.  Latches
are held on P and C for duration of split, and then released.

Hard case is P does not have room for descriminator key.  In this case
all latches are released, and derby does a split pass from top to
bottom, and will split internal node that does not have room for the
decrimator key.  Note this may result in more splits than necessary for
this particular insert, but the assumption is that the splits will have
to happen eventually anyway.  After this split pass is done, the search
for the insert starts again from top down, but it must once again check
for space because it has given up all it's latches and someone else may
have gotten to the space first.

Optimization is possible to remember C and see if it is right location,
and/or use sideway pointers to search right rather than do research of
tree.

Dibyendu Majumdar wrote:

> Philip Bohannon wrote some time back:
> 
>> The classic problem here is a split taking place during a search.  Say  a
>> page C contains 50-100, and C's parent in the tree is P, which has an
>> entry for 50 and 101, with the entry for 50 pointing to C.
>>
>> Now an inserter comes in, inserts 51 and fills up C (or fills up C's
>> child, causes a split, and that fills up C).  So C splits into C and C',
>> which is added to the right of C.
>>
>> Since the inserter can't get a lock on P while holding a lock on C
>> (deadlock), it releases the lock on P.  At this point, a reader comes
>> down
>> the tree looking for 99, looks at P and gets a pointer to C not C'. Now
>> inserter starts waiting for a lock on P.  When reader gets C, 99 has been
>> moved off to C'.
>>
>> It would be interesting to know how Derby handles this (for example,
>> [Yao]
>> proposes having a pointer from C to C', and I forget what Aries IM
>> does at
>> the moment, but I think it starts over at the top if you fail to find a
>> key at the rightmost point in the page...).
> 
> 
> Philip,
> 
> I don't fully understand how Derby works, but from what I have seen so
> far, I think that Derby avoids above situation as follows:
> 
> It latches the parent P first and then the child C exclusively - and
> only then starts the split. The searcher has to wait until the split is
> over.
> 
> In Mike's words:
> 
> [In order to prevent deadlocks latches requested while holding other
> latches are always requested top/down and left to right.  Btree splits
> are always left to right.  If for any reason the code needs to break
> this protocol then it will first request the latch NOWAIT and if it
> can't get the latch it will release all current latches and wait for the
> latch it is trying to get, and then after obtaining it go about getting
> the other latches it needs for the particular operation.  While
> traversing down the tree Derby may hold 2 latches: one on parent and one
> on child.]
> 
> The question is what happens if parent P is also full and needs to be
> split. This is where things get really complicated. If my understanding
> is correct, Derby solves this in a simple and elegant way.
> 
> It releases the latches on C and P, and starts the split at P with the
> separator key. So, at that point, P becomes C, and P's parent, let us
> say Q, becomes P.
> 
> I think that this is repeated recursively up the tree until the original
> P has been split with room to insert the separator key. Then Derby
> proceeds with the original split.
> 
> Perhaps Mike or someone else can confirm or fill in with more details.
> 
> I have not seen above algorithm in any of the papers I have read - but
> then, my reading is limited to following papers:
> 
> C.Mohan and Frank Levine. ARIES/IM: an efficient and high concurrency
> index management method with write-ahead logging. ACM SIGMOD Record. V21
> N2, P371-380, June 1 1992.
> 
> Jim Gray and Andreas Reuter. Chapter 15: Access Paths. Transaction
> Processing: Concepts and Techniques. Morgan Kaufmann Publishers, 1993
> 
> Philip L. Lehman and S.Bing Yao. Efficient Locking for Concurrent
> Operations on B-Trees. Readings in Database Systems, Third Edition,
> 1998. Morgan Kaufmann Publishers.
> 
> David Lomet and Betty Salzburg. Access method concurrency with recovery.
> ACM SIGMOD, V21 N2, p351-360, June 1 1992.
> 
> 
> Regards
> 
> Dibyendu
> 
> 

Mime
View raw message