Subject: Update : [ApacheDS] Transaction support
From: Emmanuel Lecharny
To: Apache Directory Developers List <dev@directory.apache.org>
Date: Wed, 21 Feb 2018 22:01:38 -0000

Hi,

a quick update on the on-going work (last week-end and last night).

Most of the core-integ tests are passing now (except 3 errors and 4
failures). FTR, there are 335 instances of OperationContext to check in
the whole code, many of them (149) using a read transaction.

I'm facing 2 issues atm :

- first, the Search operation must not close the transaction until the
  associated cursor is closed. This is not currently done properly. The
  best solution I foresee is to store the transaction in the Cursor,
  and to close it when the cursor is closed (a rough sketch of the idea
  follows below). That implies we may face some issues if a cursor
  remains open for a long time.

- second, as we don't have a file per index anymore, the JdbmPartition
  initialization is now problematic : we are checking the presence of
  such files on disk to see if we have to re-create the missing
  indexes. IMO, this is the wrong thing to do : we should check the
  configuration and create the missing indexes accordingly if they
  aren't present in the database. That also means we should check in
  the database, something we currently don't do. This is something I
  have to fix; it's blocking some of the tests (and likely the server
  won't restart, either).

Otherwise, I have checked all the operations, before the
OperationManager and after it : the transaction seems to be properly
initiated, and committed or aborted.

I'm not far from something that works... More later.
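To make the first point concrete, here is what a cursor-owned
transaction could look like. This is a sketch only, not code from the
server : TxnOwningCursor is a made-up name, Cursor stands for the
search cursor type, and PartitionTxn is the transaction interface
discussed in this thread.

    import java.io.IOException;

    /**
     * Sketch : a cursor wrapper that keeps the read transaction it was
     * created from, and releases it when the cursor itself is closed.
     */
    public class TxnOwningCursor<E> implements AutoCloseable
    {
        private final Cursor<E> wrapped;        // the underlying search cursor
        private final PartitionTxn transaction; // lives as long as the cursor

        public TxnOwningCursor( Cursor<E> wrapped, PartitionTxn transaction )
        {
            this.wrapped = wrapped;
            this.transaction = transaction;
        }

        public boolean next() throws Exception
        {
            return wrapped.next();
        }

        public E get() throws Exception
        {
            return wrapped.get();
        }

        @Override
        public void close() throws IOException
        {
            try
            {
                wrapped.close();
            }
            finally
            {
                // Closing the cursor is what ends the read transaction
                transaction.close();
            }
        }
    }

The obvious risk, as said above, is a client keeping a cursor - and
thus the transaction - open for a long time.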
On 2018/02/17 17:26:05, Emmanuel Lécharny wrote:
> Hi,
>
> this is an update on the on-going work.
>
> I have started to add support for transactions in JDBM first, using
> the changes I applied last week.
>
> Actually, JDBM supports transactions natively (to some extent), and
> the big mistake we made was to limit their use to a per-index/table
> level. This means each index and the master table are updated within
> their own transaction, which is clearly not good when we have to make
> each LDAP operation atomic.
>
> So the first big change is that I use one single transaction manager
> across all the indexes and the master table : we commit an operation
> globally. I expect that to make things safer (ie, no concurrent
> updates of indexes leading to some database corruption).
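(To see what "committing globally" means in plain JDBM terms, here is a
tiny self-contained sketch - not code from the server, just the bare
pattern : two BTrees created against one RecordManager, so one single
commit()/rollback() covers both updates.)

    import jdbm.RecordManager;
    import jdbm.RecordManagerFactory;
    import jdbm.btree.BTree;
    import jdbm.helper.StringComparator;

    public class GlobalCommitDemo
    {
        public static void main( String[] args ) throws Exception
        {
            // One RecordManager == one transaction scope for everything below
            RecordManager recMan = RecordManagerFactory.createRecordManager( "demo" );

            // Both trees live in the same file, under the same transaction
            BTree master = BTree.createInstance( recMan, new StringComparator() );
            BTree index = BTree.createInstance( recMan, new StringComparator() );

            try
            {
                master.insert( "uid=jdoe", "the entry", false );
                index.insert( "jdoe", "a ref to the entry", false );

                // One commit covers the master table *and* the index update
                recMan.commit();
            }
            catch ( Exception e )
            {
                // One rollback reverts both updates : the operation stays atomic
                recMan.rollback();
            }
            finally
            {
                recMan.close();
            }
        }
    }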
> A typical usage looks like this (in OperationManager) :
>
>     public void add( AddOperationContext addContext ) throws LdapException
>     {
>         ...
>         // Find the working partition
>         Partition partition =
>             directoryService.getPartitionNexus().getPartition( dn );
>         addContext.setPartition( partition );
>         ...
>
>         lockWrite();
>
>         // Start a Write transaction right away
>         PartitionTxn transaction = null;
>
>         try
>         {
>             transaction = partition.beginWriteTransaction();
>             addContext.setTransaction( transaction );
>
>             head.add( addContext );
>             transaction.commit();
>         }
>         catch ( LdapException | IOException e )
>         {
>             try
>             {
>                 if ( transaction != null )
>                 {
>                     transaction.abort();
>                 }
>
>                 throw new LdapOtherException( e.getMessage(), e );
>             }
>             catch ( IOException ioe )
>             {
>                 throw new LdapOtherException( ioe.getMessage(), ioe );
>             }
>         }
>         finally
>         {
>             unlockWrite();
>         }
>     }
>
> It sounds a bit convoluted, but what we do is :
>
> - get the Partition we are working on at this level, instead of
> waiting until we are in the Partition itself. That makes sense,
> because an update operation (even a move) can't be done across
> partitions. Note that we don't fetch the Partition in the Nexus, as
> we already have it.
>
> - then we ask the selected partition to start a write transaction.
> What is interesting here is that not all the partitions will create
> this transaction. Typically, the Schema partition won't - it makes no
> sense - nor will the config partition : they have in-memory indexes,
> and the master table is an LDIF file, which is either properly
> updated or not.
>
> - the OperationContext stores the transaction and the partition : the
> transaction may be reused by interceptors (typically, we may have to
> look up some other entries, or update entries, while processing the
> on-going operation).
>
> - then we call the first interceptor, down to the nexus.
>
> - if all goes well, the transaction is committed; otherwise, it's
> aborted.
>
> For JDBM, a commit looks like this :
>
>     public void commit() throws IOException
>     {
>         recordManager.commit();
>
>         // And flush the journal
>         BaseRecordManager baseRecordManager = null;
>
>         if ( recordManager instanceof CacheRecordManager )
>         {
>             baseRecordManager = ( BaseRecordManager )
>                 ( ( CacheRecordManager ) recordManager ).getRecordManager();
>         }
>         else
>         {
>             baseRecordManager = ( BaseRecordManager ) recordManager;
>         }
>
>         if ( syncOnWrite )
>         {
>             baseRecordManager.getTransactionManager().synchronizeLog();
>         }
>     }
>
> - first, the RecordManager is committed, which flushes the updates
> into the log file (first disk write). Up to this point, all the
> updates are done in memory.
>
> - then we ask the RecordManager to update the database from the log.
>
> And that's it.
>
> Aborting the operation is way simpler :
>
>     public void abort() throws IOException
>     {
>         recordManager.rollback();
>     }
>
> I have removed all the sync() methods called all over the code; they
> are totally useless now. Bottom line, we end up with one single file
> storing the data, plus a log file containing the on-going updates. If
> the server crashes, I expect the server to catch up from the journal
> (I still have to check that aspect).
>
> Extra benefit : it's *WAY* faster. The perf test I have shows that
> adding entries in the server using this code is 2 times faster than
> with the previous approach (ok, don't get too excited, I'm talking
> about 180 add/s vs 98 add/s).
>
> There is a lot to do still. The Read transactions also need to be
> dealt with. A typical read looks like this :
>
>     ...
>     LookupOperationContext lookupContext = new LookupOperationContext(
>         adminSession, modifyContext.getDn(), "+", "*" );
>     lookupContext.setPartition( modifyContext.getPartition() );
>     lookupContext.setTransaction( modifyContext.getTransaction() );
>
>     entry = directoryService.getPartitionNexus().lookup( lookupContext );
>     ...
>
> Here we reuse the modify operation's transaction.
>
> Or :
>
>     LookupOperationContext lookupContext = new LookupOperationContext(
>         adminSession, dn );
>     lookupContext.setPartition( partition );
>
>     try ( PartitionTxn partitionTxn = partition.beginReadTransaction() )
>     {
>         lookupContext.setTransaction( partitionTxn );
>         adminGroup = nexus.lookup( lookupContext );
>     }
>     catch ( IOException ioe )
>     {
>         throw new LdapOtherException( ioe.getMessage(), ioe );
>     }
>
> Here, we create a new Read transaction in a try-with-resources block,
> so the transaction will be closed automatically. Note we don't have
> to commit or abort it.
>
> So now, I'm fixing the tests, changing all the places in the code
> where I need to start a read transaction (around 80 places for the
> lookup operation alone; then there is search, hasEntry, etc.), and
> properly encapsulating the update operations. That will take time...
>
> Once it works for JDBM, I will have to get it to work for Mavibot.
>
> Still, this is encouraging.
>
> Have a nice week-end.
>
> --
> Emmanuel Lecharny
>
> Symas.com
> directory.apache.org
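PS : since the read-transaction boilerplate above will be repeated in
dozens of call sites, a small helper could factor it out. A possible
shape - sketch only, nothing like this exists in the code base, and
ReadCallback / withReadTxn are made-up names (assuming Java 8 lambdas
are available) :

    @FunctionalInterface
    interface ReadCallback<T>
    {
        T apply( PartitionTxn txn ) throws LdapException;
    }

    private <T> T withReadTxn( Partition partition, ReadCallback<T> callback )
        throws LdapException
    {
        // Begin a read transaction, hand it to the caller, and let
        // try-with-resources close it whatever happens
        try ( PartitionTxn txn = partition.beginReadTransaction() )
        {
            return callback.apply( txn );
        }
        catch ( IOException ioe )
        {
            throw new LdapOtherException( ioe.getMessage(), ioe );
        }
    }

The second read example above would then shrink to :

    Entry adminGroup = withReadTxn( partition, txn ->
    {
        lookupContext.setTransaction( txn );
        return nexus.lookup( lookupContext );
    } );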