directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Lécharny <>
Subject Re: [Mavibot] Transaction support
Date Fri, 30 Oct 2015 07:40:41 GMT
A second thought about the Map kept in the txn context : actually, we
don't need two maps (on for the PageIO, and one for the Page), we just
need to keep a reference to the objects, whether they are instance of
Node or Leaf, or BTreeHeader, or BtreeInfo, or even RecordManagerHeader.
It's enough to have a map on <PageID, Object> where the PageID is an
incremental number referencing the modified object. We will convert this
pageId to an offset later on. All in all, this is a WAL...

Le 29/10/15 11:13, Emmanuel Lécharny a écrit :
> Hi guys,
> here are a bit of my last night's insomnia thoughts (with the shower,
> and the smallest room of my appartment, this is the only place and time
> I have for such thoughts atm ;-) :
> - txn are now going to be explicit, ie you have to begin a txn and
> commit it before and after any update, like in :
>         // Inject some data
>         recordManager.beginTransaction();
>         btree.insert( 1L, "1" );
>         btree.insert( 4L, "4" );
>         btree.insert( 2L, "2" );
>         btree.insert( 3L, "3" );
>         btree.insert( 5L, "5" );
>         recordManager.commit();
> - later on, we may include the txn begin/commit into the update
> operation, on a btree basis.
> - the direct consequence is that we don't write any thing on the disk
> before we commit. This will have a huge impact on performances as every
> single update operation involve many internal updates : BOB Btree, CPB
> Btree, RMH. Typically, in the previous example, we will save 5 updates
> of teh BOB and CPB, plus numerous updates on the RMH, not counting the 4
> updates of the btree root page and btree header.
> - of course, this is not coming for free : if one forget to commit or
> rollback the txn, then the pages will remain in memory...
> - in order to manipulate those pages in memory, we will keep a track of
> the update pages and btree headers in memory. We use Map for that :
>     Map<Long, PageIO> and Map<Long, Page>
> - now, we have two kind of map : one that keep a track on PageIO and the
> other one that keep a track on Page. Why is that ? Because the RMH and
> the BHs are not holding any value, they are specific data structures.
> Every other Page hold  Keys/Values, where value sare data or pointer to
> a child Page.
> - we need a way to reference those Page and PageIO. For PageIO, this is
> easy : we use the page offset. For plain Page, this is a different
> story, because we may create new pages (actually, this *is* what we do
> for every update, except that as we work in memory during a txn, we can
> *modify* an existing page instead of copy it, until we need to write it
> down on disk. Note that the very first time we access to a Page, either
> from teh disk or from the cache, we have to copy it, no matter what).
> - Those new pages don't have an offset, because we haven't written them
> to disk yet. The reason is that writing such a page to the disk means we
> have serialized the page - an expensive operation we want to avoid until
> we have to do it, ie when we write the page to the disk -, and we have
> fetched as many PageIO as necessary to write the page to the disk
> (fetching a new PageIO will first look in the Free Page list or grab new
> PageIO at the end of the file, expending the file).
> - So we have to find a way to identify those pages by other means. One
> idea is to use an incremental ID associated with the txn. Every Page
> will have a Long ID.
> - now, the pb is that we use offset in pages to reference child pages.
> We must be able to distinguish an offset from an ID. We will use a
> negative long for PageID.
> - when we commit the txn, we will have to update all the references to
> IDs to replace them with offsets. At this point, the easiest way is to
> read all the Nodes we keep in the Map, and check all of their values to
> see if we are refering to an ID instead of an Offset, and change that.
> - that drives the order in which we write the pages : we have to start
> from the leaves, and up to the root page
> - another approach would be to keep a track of the modified values for
> each Node (ie, value X has been change is now reffer to page Y, etc...).
> This will be memory consuming, but CPU efficient. All in all, it's
> another Map to handle : Map<PageID, Set<Modified Value position>>. This
> approach is probably what we want to use...
> - modifying a page instead of copying it requires we add new methods in
> the Btree class. Not a huge change.
> - a roolback is just a matter of deleting the txn context: straightforward !
> At this point, I think we have a roadmap for txn support. I said nothing
> about managing free pages here, it has already been discussed in another
> thread, but I'll come later with a more explicit mail.
> I'm curently working in a branch, in which I have removed the InMemory
> Btrees and the support for multiple values. The branch has been created,
> but I haven't commit yet anything in it : I have too many broken things.
> I'll commit asap (may be this week-end if I have time to work on my free
> time - familly comes first !)
> Thanks !
> BOB : Btree-Of-Btrees, the Btree that hold the revisions in use
> CPB : Copied-Pages-Btree, the btree that holds teh list of pages that
> have been copied during an update
> RMH : RecordManager Header, the data structure that maintains the
> pointers to the BOB and CPB, both the current and previous, plus some
> extra informations, like the FreeList page
> BH :  Btree Header, the structure that stors meta-information about a
> B-tree (serializers, comparators, nb elements, pointer to the root page,
> etc)

View raw message