From Selcuk AYA <>
Subject merging txn branch back into the trunk
Date Mon, 07 May 2012 09:04:45 GMT
Hi All,

I think we have a good checkpoint for the transaction branch and can
merge it back into the trunk. There are significant changes to the
server code, so it would be good to review them and understand their
impact as much as possible. Below is a summary of the general
motivation, what has been done so far, and its effect on the current
code.


*** The general motivation is to implement a logical transaction layer
above partitions. This transaction layer keeps a log of logical
changes. It works above partitions and expects partitions to conform
to a certain data model and give some consistency guarantees:

    1) Partitions should expose a MasterTable which stores Entry
       objects. They should also expose the expected system indices
       and can expose user indices. Master and Index tables are
       basically (key, value) tables.
    2) Supported modification operations on Master and Index tables
       should be atomic.
    3) Lookup on Master and Index tables by their key should return
       consistent data.
    4) Scan on Master and Index tables should return committed data
       (this does not necessarily mean a scan will return data with
       snapshot consistency; it might return data committed since it
       began scanning).
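
To make the four requirements above concrete, here is a minimal sketch
of a (key, value) table that meets them. KvTable and InMemoryKvTable
are hypothetical names used only for illustration; they are not the
actual MasterTable or Index interfaces in the branch. The sketch
assumes a ConcurrentSkipListMap as the backing store, whose iterators
are weakly consistent, which matches requirement 4: a scan never sees
a half-applied change but may see changes made after it began.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.BiConsumer;

/** Hypothetical (key, value) table contract; not the real ApacheDS interface. */
interface KvTable<K, V> {
    void put(K key, V value);             // requirement 2: atomic modification
    void remove(K key);                   // requirement 2: atomic modification
    V get(K key);                         // requirement 3: consistent lookup by key
    void scan(BiConsumer<K, V> visitor);  // requirement 4: committed data, no snapshot
}

/** Illustrative in-memory table backed by a concurrent sorted map. */
class InMemoryKvTable<K extends Comparable<K>, V> implements KvTable<K, V> {
    private final ConcurrentSkipListMap<K, V> map = new ConcurrentSkipListMap<>();

    @Override
    public void put(K key, V value) {
        map.put(key, value);
    }

    @Override
    public void remove(K key) {
        map.remove(key);
    }

    @Override
    public V get(K key) {
        return map.get(key);
    }

    @Override
    public void scan(BiConsumer<K, V> visitor) {
        // Weakly consistent iteration: safe under concurrent updates and
        // may reflect entries added or removed after the scan started.
        for (Map.Entry<K, V> e : map.entrySet()) {
            visitor.accept(e.getKey(), e.getValue());
        }
    }
}
```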

Towards this goal, the following changes have been made:

    1) JDBM has been changed to provide the above consistency
       requirements (done and merged separately).
    2) The AVL partition has been rewritten to provide the above
       consistency guarantees.
    3) The Partition interface has been changed to expose its
       MasterTable, the necessary system indices, and user indices.

*** Another goal of this effort was to move the operation execution
logic above partitions so that transactions can be implemented
independently of partitions, given the above consistency guarantees.
Towards this goal, DefaultOperationExecutionManager has been
introduced. It mostly copies the logic in AbstractBTreePartition, but
executes updates on master tables and indices using log edits rather
than direct updates. These log edits are then applied to partitions
using the atomic modifications on Master and Index tables exposed by
partitions. This is the class we most need to review carefully to
ensure things are OK.
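
The log-edit idea above can be sketched roughly as follows. TableEdit
and EditLog are hypothetical names invented for this illustration and
do not correspond to classes in the branch; a plain Map stands in for
a partition's master table. The point is only that an operation
records logical edits first and the table is touched later, when the
edits are applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical logical change to one (key, value) table. */
final class TableEdit {
    enum Kind { PUT, REMOVE }

    final Kind kind;
    final UUID key;
    final String value; // null for REMOVE

    TableEdit(Kind kind, UUID key, String value) {
        this.kind = kind;
        this.key = key;
        this.value = value;
    }

    /** Applies this edit using the table's own single-key modification. */
    void applyTo(Map<UUID, String> table) {
        if (kind == Kind.PUT) {
            table.put(key, value);
        } else {
            table.remove(key);
        }
    }
}

/** Hypothetical per-operation log: edits are recorded, not executed. */
final class EditLog {
    private final List<TableEdit> edits = new ArrayList<>();

    void put(UUID key, String value) {
        edits.add(new TableEdit(TableEdit.Kind.PUT, key, value));
    }

    void remove(UUID key) {
        edits.add(new TableEdit(TableEdit.Kind.REMOVE, key, null));
    }

    int size() {
        return edits.size();
    }

    /** Replays the recorded edits against the table in order. */
    void applyTo(Map<UUID, String> table) {
        for (TableEdit edit : edits) {
            edit.applyTo(table);
        }
    }
}
```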

*** Partitions can still use any search engine they want and get
transactionally consistent data using the methods exposed by
TxnManager. For DefaultSearchEngine, this is done using Index and
MasterTable wrappers which merge what is read from the underlying
partitions with what is in the txn logs. So the search engine code
works pretty much without any significant change (except the changes
to remove generics and use UUID as described below).
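
As a rough illustration of the wrapper idea, the sketch below overlays
pending changes on top of a base table when reading. TxnView is a
hypothetical name and is not the TxnManager API or the actual wrapper
class; a plain Map stands in for the partition's table, and a pending
delete is modelled as an empty Optional.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical read view: base table merged with uncommitted changes. */
final class TxnView {
    private final Map<UUID, String> base; // stands in for the partition table
    // Optional.empty() marks a key deleted by the transaction.
    private final Map<UUID, Optional<String>> pending = new HashMap<>();

    TxnView(Map<UUID, String> base) {
        this.base = base;
    }

    void put(UUID id, String value) {
        pending.put(id, Optional.of(value));
    }

    void delete(UUID id) {
        pending.put(id, Optional.empty());
    }

    /** Pending changes win over the base table; the base is never modified. */
    String get(UUID id) {
        Optional<String> change = pending.get(id);
        if (change != null) {
            return change.orElse(null);
        }
        return base.get(id);
    }
}
```

A reader such as a search engine only calls get() and does not need to
know whether a value came from the partition or from the log.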

*** To make implementing the transactional layer easier, UUID is used
as the key for all partitions. A side effect of this is that generics
are mostly removed from the code. So, for example, the Index interface
is changed to:
          UUID forwardLookup( K attrVal ) throws Exception;
          K reverseLookup( UUID id ) throws Exception;
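
For readers less familiar with the index code, here is a toy
in-memory class with the same two lookup shapes. SketchIndex and its
add method are hypothetical and exist only for this example; the
sketch also assumes each attribute value maps to exactly one entry
id, which real indices do not guarantee.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical index: attribute value <-> entry UUID, one-to-one. */
final class SketchIndex<K> {
    private final Map<K, UUID> forward = new HashMap<>();
    private final Map<UUID, K> reverse = new HashMap<>();

    /** Records the mapping in both directions. */
    void add(K attrVal, UUID id) {
        forward.put(attrVal, id);
        reverse.put(id, attrVal);
    }

    /** Returns the entry id for an attribute value, or null if absent. */
    UUID forwardLookup(K attrVal) {
        return forward.get(attrVal);
    }

    /** Returns the attribute value for an entry id, or null if absent. */
    K reverseLookup(UUID id) {
        return reverse.get(id);
    }
}
```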

*** There was a sync method which we called on partitions every x ms.
This has been removed (flushing of partitions handles it).

*** All logical data changes (including schema registries) are
implemented using a single lock. Locks for individual caches are
removed (for example, the referral management lock is no longer used).

********OTHER CHANGES*******************

The rest of the changes have less impact on the existing code. The
guts of the transaction management system are implemented in the
core.txn package. It is OCC+MVCC (except for the gross hack we use
for logical data handling). It uses a logging system for write-ahead
logging (WAL).

******OPEN ISSUES***********

Will send another email about these.


- One and Sub level index removal will probably impact
- From the emails, I understand that the Index interface assuming UUID
might be a problem with the recent changes. Maybe this needs to be
changed too.

Either before the merge, or both before and after it, we should run a
test with concurrent threads (read+write) and clear out any remaining


After the merge, I will work on implementing the crash recovery part,
which will complete the transactional layer changes.


