directory-dev mailing list archives

From Emmanuel Lecharny <elecha...@apache.org>
Subject Partly done : [ApacheDS] Transaction support
Date Mon, 26 Feb 2018 10:27:17 GMT
Hi guys,

I'm done with the changes in the server : all the tests are now passing green.

That means we now have a JDBM based server where each operation is atomic. That should solve the
database corruption issues we have seen.

In the process, I also changed the way the API handles requests/responses, modifying the operation
Future and removing the queues that were used, except for operations that can produce more than
one response (Bind, Search and ExtendedOperation). We now use a simple wait/notify solution that
works well. The Cursor implementation has also been slightly modified to remove the monitor
in it, as it was not used. Last but not least, the operation Future.isDone() method now returns
'true' if we have received an operation response. This is important for async implementations, as
it's a non-blocking way to know if we can go on and fetch the response (if the operation has not
been abandoned, of course).
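For context, a minimal sketch of such a wait/notify based future might look like this (hypothetical class and field names, not the actual Directory API code):

```java
// Minimal sketch of a wait/notify based response future
// (hypothetical names, not the real Apache Directory API classes).
public class ResponseFuture
{
    private Object response;   // the operation response, once received
    private boolean done;      // set when a response arrives

    /** Called by the IO layer when the response message arrives. */
    public synchronized void set( Object resp )
    {
        response = resp;
        done = true;
        notifyAll();           // wake up any thread blocked in get()
    }

    /** Blocking fetch: waits until the response is available. */
    public synchronized Object get() throws InterruptedException
    {
        while ( !done )
        {
            wait();
        }

        return response;
    }

    /** Non-blocking check, useful for async callers. */
    public synchronized boolean isDone()
    {
        return done;
    }

    public static void main( String[] args ) throws Exception
    {
        ResponseFuture future = new ResponseFuture();
        new Thread( () -> future.set( "bindResponse" ) ).start();
        System.out.println( future.get() );
    }
}
```

The point of isDone() here is exactly the non-blocking check described above: an async caller can poll it and only call get() once a response is known to be there.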

Otherwise, I have also removed a useless feature in the JDBM Partition : we were storing
the number of elements in a B-tree beside the B-tree itself, which is totally spurious as the B-tree
already knows this number : there is a btree.size() method. This saves a few writes.
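As a toy illustration of the removed redundancy (a TreeMap stands in for the JDBM B-tree here; the counter field is hypothetical):

```java
import java.util.TreeMap;

// Toy illustration (TreeMap standing in for the JDBM B-tree): the extra
// persisted counter duplicates what the tree already knows via size().
public class CountRedundancy
{
    /** True when the separately stored counter merely repeats btree.size(). */
    public static boolean counterMatches( TreeMap<String, String> btree, long storedCount )
    {
        return storedCount == btree.size();
    }

    public static void main( String[] args )
    {
        TreeMap<String, String> btree = new TreeMap<>();
        long storedCount = 0;            // the counter formerly written beside the B-tree

        btree.put( "cn=Alice", "entry1" );
        storedCount++;                   // every update paid this extra bookkeeping write

        System.out.println( counterMatches( btree, storedCount ) );
    }
}
```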

Overall, as I said, the server is 20% faster, thanks to the saved writes and to changes
in the way the LDAP API handles values.

I still have some cleanup to do, but I think we are on the verge of a release of the API and
the server, which will also lead to a release of Studio :-)

On 2018/02/21 22:01:38, Emmanuel Lecharny <elecharny@apache.org> wrote: 
> Hi,
> 
> a quick update on the on-going work (last week-end and last night).
> 
> Most of the core-integ tests are passing now (except 3 errors and 4 failures). FTR, there
are 335 instances of OperationContext to check in the whole code, many of them (149) using a
readTransaction.
> 
> I'm facing 2 issues atm :
> - first, the Search operation must not close the transaction until the associated cursor
is closed. This is not currently done properly. The best solution I foresee is to store the
transaction in the Cursor, and to close it when the cursor is closed. That implies we
may face some issues if the cursor remains open for a long time.
> - second, as we don't have a file per index anymore, the JdbmPartition initialization
is now problematic : we were checking the presence of such files on disk to see if we had
to re-create the missing indexes. IMO, this is the wrong thing to do : we should check the
configuration and create the indexes accordingly if they aren't present in the database. That
also means we should check in the database, something we currently don't do. This is something
I have to fix, as it's blocking some of the tests (and likely the server won't restart either).
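The "store the transaction in the Cursor" idea from the first point could be sketched like this (hypothetical types, not the actual ApacheDS Cursor or PartitionTxn interfaces):

```java
import java.util.Iterator;

// Sketch: a cursor that owns the read transaction it was opened under,
// and releases that transaction when the cursor itself is closed
// (hypothetical types, not the actual ApacheDS classes).
public class TxnCursor<E> implements AutoCloseable
{
    private final AutoCloseable transaction;   // the read transaction to release
    private final Iterator<E> results;

    public TxnCursor( AutoCloseable transaction, Iterator<E> results )
    {
        this.transaction = transaction;
        this.results = results;
    }

    public boolean next()
    {
        return results.hasNext();
    }

    public E get()
    {
        return results.next();
    }

    /** Closing the cursor also closes the underlying transaction. */
    @Override
    public void close() throws Exception
    {
        transaction.close();
    }

    public static void main( String[] args ) throws Exception
    {
        AutoCloseable txn = () -> System.out.println( "transaction closed" );

        try ( TxnCursor<String> cursor = new TxnCursor<>( txn, java.util.List.of( "e1", "e2" ).iterator() ) )
        {
            while ( cursor.next() )
            {
                System.out.println( cursor.get() );
            }
        }
    }
}
```

This makes the transaction's lifetime exactly the cursor's lifetime, which is also why a long-lived open cursor pins the transaction open, as noted above.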
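The configuration-driven check from the second point might boil down to something like this (hypothetical helper, not the real JdbmPartition initialization code):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: recreate missing indexes from the configuration instead of
// probing for per-index files on disk (hypothetical names).
public class IndexInit
{
    /** Returns the indexes that are configured but absent from the database. */
    public static Set<String> missingIndexes( Set<String> configured, Set<String> inDatabase )
    {
        Set<String> missing = new HashSet<>( configured );
        missing.removeAll( inDatabase );
        return missing;
    }

    public static void main( String[] args )
    {
        Set<String> configured = Set.of( "objectClass", "cn", "ou" );
        Set<String> inDatabase = Set.of( "objectClass" );

        // The partition would then rebuild only these missing indexes.
        System.out.println( missingIndexes( configured, inDatabase ) );
    }
}
```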
> 
> Otherwise, I have checked all the operations, before the OperationManager and after it :
the transaction seems to be properly initiated and committed or aborted. 
> 
> I'm not far from something that works...
> 
> More later.
> 
> On 2018/02/17 17:26:05, Emmanuel Lécharny <elecharny@gmail.com> wrote: 
> > Hi,
> > 
> > this is an update on the on-going work.
> > 
> > I have started by adding transaction support to JDBM first, using the
> > changes I applied last week.
> > 
> > Actually, JDBM supports transactions natively (to some extent), and the
> > big mistake we had made was to limit their use to a per-index/table
> > level. That means each index and the master table were updated within
> > their own transaction, which is clearly not good when we have to make each
> > LDAP operation atomic.
> > 
> > So the first big change is that I use one single transaction manager
> > across all the indexes and the master table : we commit an operation
> > globally. I expect that to make things safer (ie, no concurrent
> > updates of indexes leading to some database corruption).
> > 
> > A typical usage looks like (in OperationManager) :
> > 
> > public void add( AddOperationContext addContext ) throws LdapException
> > {
> >     ...
> >     // Find the working partition
> >     Partition partition =
> > directoryService.getPartitionNexus().getPartition( dn );
> >     addContext.setPartition( partition );
> >     ...
> > 
> >     lockWrite();
> > 
> >     // Start a Write transaction right away
> >     PartitionTxn transaction = null;
> > 
> >     try
> >     {
> >         transaction = partition.beginWriteTransaction();
> >         addContext.setTransaction( transaction );
> > 
> >         head.add( addContext );
> >         transaction.commit();
> >     }
> >     catch ( LdapException | IOException e )
> >     {
> >         try
> >         {
> >             if ( transaction != null )
> >             {
> >                 transaction.abort();
> >             }
> > 
> >             throw new LdapOtherException( e.getMessage(), e );
> >         }
> >         catch ( IOException ioe )
> >         {
> >             throw new LdapOtherException( ioe.getMessage(), ioe );
> >         }
> >     }
> >     finally
> >     {
> >         unlockWrite();
> >     }
> > }
> > 
> > It sounds a bit convoluted, but what we do is :
> > 
> > - get the Partition we are working on at this level, instead of waiting
> > until we are in the Partition itself. That makes sense because an update
> > operation (even a move) can't be done across partitions. Note that we
> > don't fetch the Partition in the Nexus, as we already have it
> > 
> > - then we ask the selected partition to start a write transaction. What
> > is interesting here is that not all the partitions will create this
> > transaction. Typically, the Schema partition won't - it makes no sense
> > -, nor will the config partition : they have in-memory indexes, and the
> > master table is an LDIF file, so it's either properly updated or not.
> > 
> > - the OperationContext stores the transaction and the partition : the
> > transaction may be reused by interceptors (typically, we may have to
> > look up some other entries, or update entries, while processing the
> > on-going operation)
> > 
> > - then we call the first interceptor, down to the nexus.
> > 
> > - If all goes well, the transaction is committed, otherwise, it's aborted
> > 
> > 
> > For JDBM, a commit looks like this :
> > 
> >     public void commit() throws IOException
> >     {
> >         recordManager.commit();
> > 
> >         // And flush the journal
> >         BaseRecordManager baseRecordManager = null;
> > 
> >         if ( recordManager instanceof CacheRecordManager )
> >         {
> >             baseRecordManager = ( ( BaseRecordManager )
> >                 ( ( CacheRecordManager ) recordManager ).getRecordManager() );
> >         }
> >         else
> >         {
> >             baseRecordManager = ( ( BaseRecordManager ) recordManager );
> >         }
> > 
> > 
> >         if ( syncOnWrite )
> >         {
> >             baseRecordManager.getTransactionManager().synchronizeLog();
> >         }
> >     }
> > 
> > - First the RecordManager is committed, which flushes the updates into the
> > Log file (first disk write). Up to this point, all the updates are done
> > in memory.
> > 
> > - then we ask the RecordManager to update the database from the log.
> > 
> > And that's it.
> > 
> > Aborting the operation is way simpler :
> > 
> >     public void abort() throws IOException
> >     {
> >         recordManager.rollback();
> >     }
> > 
> > 
> > I have removed all the sync() methods called all over the code, they are
> > totally useless now. Bottom line, we end up with one single file
> > storing the data, plus a log file containing the on-going updates. If the
> > server crashes, I expect it to catch up from the journal (I still
> > have to check that aspect).
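As a toy model of that journal catch-up (purely illustrative, not JDBM's actual log format or recovery code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of journal replay after a crash: committed journal
// entries are re-applied to the main store, in order
// (not JDBM's real log format).
public class JournalReplay
{
    /** Applies each journalled key/value write to the store, in order. */
    public static void replay( Map<String, String> store, List<String[]> journal )
    {
        for ( String[] entry : journal )
        {
            store.put( entry[0], entry[1] );
        }
    }

    public static void main( String[] args )
    {
        Map<String, String> store = new HashMap<>();
        List<String[]> journal = new ArrayList<>();
        journal.add( new String[]{ "uid=1", "cn=Alice" } );
        journal.add( new String[]{ "uid=2", "cn=Bob" } );

        // After a crash, the surviving journal brings the store up to date.
        replay( store, journal );
        System.out.println( store.size() );
    }
}
```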
> > 
> > Extra benefit : it's *WAY* faster. The perf test I have shows that
> > adding entries in the server using this code is 2 times faster than with
> > the previous approach (ok, don't get too excited, I'm talking about 180
> > add/s vs 98 add/s)
> > 
> > 
> > There is a lot to do still. The Read transactions also need to be dealt
> > with. A typical read looks like this :
> > 
> >     ...
> >     LookupOperationContext lookupContext = new LookupOperationContext(
> >         adminSession, modifyContext.getDn(), "+", "*" );
> >     lookupContext.setPartition( modifyContext.getPartition() );
> >     lookupContext.setTransaction( modifyContext.getTransaction() );
> > 
> >     entry = directoryService.getPartitionNexus().lookup( lookupContext );
> >     ...
> > 
> > here we reuse the modify operation's transaction.
> > 
> > Or :
> > 
> >     LookupOperationContext lookupContext = new LookupOperationContext( adminSession, dn );
> >     lookupContext.setPartition( partition );
> > 
> >     try ( PartitionTxn partitionTxn = partition.beginReadTransaction() )
> >     {
> >         lookupContext.setTransaction( partitionTxn );
> >         adminGroup = nexus.lookup( lookupContext );
> >     }
> >     catch ( IOException ioe )
> >     {
> >         throw new LdapOtherException( ioe.getMessage(), ioe );
> >     }
> > 
> > Here, we create a new Read transaction in a try-with-resource call, this
> > transaction will be automatically closed. Note we don't have to commit
> > or abort it.
> > 
> > 
> > So now, I'm fixing the tests, changing all the places in the code where
> > I need to start a read transaction (around 80 places for the lookup
> > operation alone, for instance, then there are search, hasEntry, etc.),
> > properly encapsulating the update operations. That will take time...
> > 
> > Once it works for JDBM, I will have to get it to work for Mavibot.
> > 
> > Still, this is encouraging.
> > 
> > have a nice week-end.
> > 
> > -- 
> > Emmanuel Lecharny
> > 
> > Symas.com
> > directory.apache.org
> > 
> > 
> 
