directory-dev mailing list archives

From Emmanuel Lecharny
Subject Re: About operation atomicity
Date Sat, 29 May 2010 15:43:50 GMT
On 5/29/10 5:15 PM, Stefan Seelmann wrote:
> Hi,
> this discussion makes me think that we should build MVCC directly into
> XDBM. I think it should work independent of the underlying store (JDBM,
> AVL, HBase).
> Let me outline my idea:
> Instead of MasterTable<ID, Entry> we use a MasterTable<ID,
> SortedMap<Long, Entry>>. It stores multiple versions of the entry in a
> sorted map, the key of the sorted map is the version number. The same
> goes for the index tables.
> We need some global version information which contains
> - version counter (sequence or timestamp)
> - the latest valid version
> - list of writes in progress, and
> - list of failed writes
> At the beginning of a read or write operation we get a snapshot of this
> global version info. This snapshot is used to get a consistent view of
> the data for the whole operation. For each read we use the highest
> version that is less than or equal to the "latest" version, excluding
> versions contained in the "in progress" or "failed" list. For a search
> operation this snapshot can also be used by the cursor, so while fetching
> all results there would always be a consistent view of the data.
> A write operation works as follows:
> - Acquire the next version number, the version number is added to the
> list of "writes in progress" (begin transaction).
> - All micro-writes to index and master table are performed with this
> version.
> - All write operations add data, deletes/drops result in the addition of
> an empty value <Version, null> (append-only).
> - For commit the version number is removed from the "writes in progress"
> list and the latest valid version is set (unless it is already higher).
> - If the write fails then the version number is moved to the "failed
> writes" list.
> At some point we have to do some garbage collection and delete old
> versions. If the version is a timestamp this can be done by the next
> write by deleting versions older than X days.
> I think this way we can also support RFC 5805 LDAP Transactions.
> A big disadvantage may be performance issues as we always have to read
> and write all the versions from/to the underlying store.
> Thoughts?
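
For illustration, the proposal quoted above could be sketched roughly as follows. This is only a minimal in-memory sketch, not XDBM or ApacheDS code: the class VersionedTable, its nested Snapshot type and all method names are hypothetical, a NavigableMap (a SortedMap subtype) stands in for the per-entry version map, and coarse synchronized methods stand in for whatever locking a real store would use. Garbage collection of old versions is left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a versioned master table; names are assumptions. */
class VersionedTable<ID, E> {
    /** Copy of the global version info, taken once at the start of an operation. */
    static final class Snapshot {
        final long latest;        // latest valid version at snapshot time
        final Set<Long> excluded; // versions in progress or failed at snapshot time

        Snapshot(long latest, Set<Long> excluded) {
            this.latest = latest;
            this.excluded = excluded;
        }
    }

    private final Map<ID, NavigableMap<Long, E>> master = new HashMap<>();
    private final Set<Long> inProgress = new HashSet<>();
    private final Set<Long> failed = new HashSet<>();
    private long counter = 0; // version counter (a sequence here, not a timestamp)
    private long latest = 0;  // latest valid version

    synchronized Snapshot snapshot() {
        Set<Long> excluded = new HashSet<>(inProgress);
        excluded.addAll(failed);
        return new Snapshot(latest, excluded);
    }

    /** Begin transaction: acquire the next version and mark it as in progress. */
    synchronized long begin() {
        long version = ++counter;
        inProgress.add(version);
        return version;
    }

    /** Append-only micro-write; a null entry records a delete. */
    synchronized void put(long version, ID id, E entry) {
        master.computeIfAbsent(id, k -> new TreeMap<>()).put(version, entry);
    }

    /** Commit: leave the in-progress list and raise "latest" unless already higher. */
    synchronized void commit(long version) {
        inProgress.remove(version);
        latest = Math.max(latest, version);
    }

    /** Failure: move the version from the in-progress list to the failed list. */
    synchronized void fail(long version) {
        inProgress.remove(version);
        failed.add(version);
    }

    /** Read the highest version <= snapshot.latest that the snapshot does not exclude. */
    synchronized E get(Snapshot s, ID id) {
        NavigableMap<Long, E> versions = master.get(id);
        if (versions == null) {
            return null;
        }
        for (Map.Entry<Long, E> e : versions.headMap(s.latest, true).descendingMap().entrySet()) {
            if (!s.excluded.contains(e.getKey())) {
                return e.getValue(); // may be null, meaning "deleted in this version"
            }
        }
        return null;
    }
}
```

A reader holding an older Snapshot keeps seeing the same data even after later writes commit, which is the consistent view the proposal asks for. Note that get() returns null both for an entry that never existed and for one whose visible version is a delete marker.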
I'm afraid you will have trouble updating the Map when it contains
many versions, since more than one thread may be accessing this Map.

You are just moving the problem from one place to another doing so.
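
To make the concern concrete: adding a version means reading the entry's Map, changing it and writing it back, so two writers can lose each other's update unless that step is atomic. The sketch below is only an assumption about how this could look in memory; VersionMapStore and its methods are hypothetical names, and ConcurrentHashMap.compute is used here purely as a stand-in for whatever atomic update the underlying store would have to provide.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch: per-entry version maps updated atomically, copy-on-write. */
class VersionMapStore<E> {
    private final ConcurrentMap<Long, SortedMap<Long, E>> table = new ConcurrentHashMap<>();

    /** Adds one version; compute() runs the read-modify-write atomically per key. */
    void addVersion(long id, long version, E entry) {
        table.compute(id, (key, old) -> {
            // Copy the whole version map: the cost grows with the number of versions.
            TreeMap<Long, E> copy = (old == null) ? new TreeMap<Long, E>() : new TreeMap<Long, E>(old);
            copy.put(version, entry);
            return Collections.unmodifiableSortedMap(copy);
        });
    }

    /** Returns an immutable view that readers can iterate without locking. */
    SortedMap<Long, E> versions(long id) {
        return table.getOrDefault(id, Collections.<Long, E>emptySortedMap());
    }
}
```

Even in this form every write copies all versions of the entry, so the synchronization and its cost have only moved into the Map update.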

I'm afraid that MVCC should be implemented at the lowest level...

PS: I'm currently looking at HawtDB.

Emmanuel Lécharny
