directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel L├ęcharny <elecha...@gmail.com>
Subject [Mavibot] redux...
Date Tue, 13 Jun 2017 09:18:56 GMT
Hi guys !


3 months without touching this beast, I was quite busy enjoying the new
little me, vacations and API/ApacheDS release (which took me one full
week) ! So I have been able to put some thought on mental paper last weeks.


I came to realize that teh way we process updates is not completly
correct, and need to be amended. Here are two items that need to be
refactored :


BtreeOfBtrees

-------------

This B-tree is supposed to hold all the existing revisions for all the
existing B-trees (until a B-tree/revision is not anymore in use). There
are two problems with this approach :

o although it's necessary to be able to retrieve an old revision for a
given b-tree (because a read transaction is still alive and needs access
to it), we don't need to store that information on disk, because if the
system is stopped, then all those old revisions are simply useless - we
just need the latest revision for each B-tree -. That means we most
certainly don't need to keep a B-tree for that purpose, we can simply
have a Map<B-tree, revision> in memory that is serialiazed when a write
transaction is committed. The serialized data would just be teh offset
of the latest B-tree rootPage, so 8 bytes. That will be a very dense way
to store this information, and it means we will write way less pages on
disk, compared to a standard B-tree update (we are talking about 1 page
being written (for a 512 bytes page, which can hold 63 B-tree offsets),
compared to a B-tree header plus as many nodes and leaves written for an
updated B-tree.

We would also save time at startup, as we won't have anymore to remove
all the dead versions. Not that it's critical, but if we can avoid doing
so, great. OTOH, we will have to fetch the info from the disk for each
B-tree we haven't yet accessed. Not such a hassle though (either we do
it for all teh B-trees at the beginning, or we wait until a B-tree is
accessed teh first time to do so).


o The more important problem is that we currently do updates on a B-tree
instance, like :

    btree.insert( writeTxn, key, value );

This is not good, because in a multi-threaded environment the btree may
be used by more than one thread. Of course, we will have some locks
pending for an update, but if the btree is used by readers, we can't
conveniently make it using one specific version, unless we make the
B-tree stateless and immutable (that means the btree instance doesn't
know anything about its current revision, which is handled by the
transaction).

Another option would be to move the functionality to the RecordManager :

    recordManager.insert( writeTxn, "test", key, value );

Here, we tell the RM to fetch the b-tree by its name, and to add the <K,V>.

This second flavor is easier to implement, because we have one single
access point.


OTOH, we can make teh first flavor working, but it's more complex, and
it requires some substancial modifications on teh way the B-tree is
implemented.


I still have to think more about it.


More to come later !




-- 
Emmanuel Lecharny

Symas.com
directory.apache.org


Mime
View raw message