cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chiradeep Vittal <Chiradeep.Vit...@citrix.com>
Subject Re: [DISCUSS] Transaction Hell
Date Wed, 09 Oct 2013 17:38:37 GMT
+1 to option B (for a lot of the reasons enunciated by Darren).
Also, let's get this in right away so that by 1/31/2014 we are confident
about the change and fixed any bugs uncovered by the new scheme.

On 10/9/13 10:29 AM, "Darren Shepherd" <darren.s.shepherd@gmail.com> wrote:

>Pedro,
>
>From a high level I think we'd probably agree.  Generally I feel an
>IaaS platform is largely a metadata management framework that stores
>the "desired" state of the infrastructure and then pro-actively tries
>to reconcile the desired state with reality.  So failures should be
>recovered from easily as inconsistency will be discovered and
>reconciled.  Having sad that, ACS is not at all like that.  It is very
>task oriented.  Hopefully I/we/everyone can change that, its a huge
>concern of mine.  The general approach in ACS I see is do task X and
>hopefully it works.  If it doesn't work, well hopefully we didn't
>leave things in an inconsistent state.  If we find it does leave
>things in an inconsistent state, write a cleanup thread to fix bad
>things in bad states....
>
>Regarding TX specifically.  This is a huge topic.  I really don't know
>where to start.  I have so many complaints with the data access in
>ACS.  There's what I'd like to see, but its so far from what it really
>is.  Instead I'll address specifically your question.
>
>I wish we were doing transaction per API, but I don't think that was
>ever a consideration.  I do think the sync portion of API commands
>should be wrapped in a single transaction.  I really think the
>original intention of the Transaction framework was to assist in
>cleaning up resources that people always forget to close.  I think
>that is mostly it.
>
>The general guidelines of how I'd like transactions to work would be
>
>1) Synchronous portions of API commands are wrapped in a single
>transaction.  Transaction propagation capability from spring tx can
>then handle nesting transaction as more complicated transaction
>management may be need in certain places.
>
>2) Async jobs that run in a background threads should do small fine
>grained transaction management.  Ideally no transactions.
>Transactions should not be used as a locking mechanism.
>
>Having said that, there are currently so many technical issues in
>getting to that.  For example, with point 1, because IPC/MessageBus
>and EventBus were added recently, that makes it difficult to do 1.
>The problem is that you can't send a message while a DB tx is open
>because the reciever may get the message before the commit.  So
>messaging frameworks have to be written in consideration of the
>transaction management.  Not saying you need to do complex XA style
>transactions, there's simpler ways to do that.  So regarding points 1
>and 2 I said.  That's what I'd like to see, but I know its a long road
>to that.
>
>Option B is really about introducing an API that will eventually serve
>as a lightweight wrapper around Spring TX.  In the short term, if I do
>option B, the implementation of the code will still be the custom ACS
>TX mgmt.  So across modules, its sorta kinda works but not really.
>But if I do the second step of replacing custom ACS TX impl with
>Spring TX, it will follow how Spring TX works.  If we have Sprint TX
>we can then leverage the transaction propagation features of it to
>more sanely handle transaction nesting.
>
>I feel I went a bit the weeds with that response, but maybe something
>in there made sense.
>
>Darren
>
>On Wed, Oct 9, 2013 at 9:31 AM, Pedro Roque Marques
><pedro.r.marques@gmail.com> wrote:
>> Darren,
>> My assumption when I tried to make sense of the transaction code is
>>that the underlying motivation is that the code is trying to create a
>>transaction per API call and then allow multiple modules to implement
>>that API call...
>> i.e. the intent is do use a bit of what i would call a "web-server
>>logic"...
>>
>> 1. API call starts.
>> 2. Module X starts transaction...
>> 3. Module Y does some other changes in the DB...
>> 4. Either the API call completes successfully or not... commit or error
>>back to the user.
>>
>>  I suspect that this was probably the starting point... but it doesn't
>>really work as i describe above. Often when the plugin i'm working on
>>screws up (or XenServer is misconfigured) one ends up with DB objects in
>>inconsistent state.
>>
>> I suspect that the DB Transaction design needs to include what is the
>>methodology for the design of the management server.
>>
>> In an ideal world, i would say that API calls just check authorization
>>and quotas and should store the intent of the management server to reach
>>the desired state. State machines that can then deal with transient
>>failures should then attempt to move the state of the system to the
>>state intended by the user. That however doesn't seem to reflect the
>>current state of the management server.
>>
>> I may be completely wrong... Can you give an example in proposal B of
>>how a transaction would span multiple modules of code ?
>>
>>   Pedro.
>>
>> On Oct 9, 2013, at 1:44 AM, Darren Shepherd wrote:
>>
>>> Okay, please read this all, this is important...  I want you all to
>>> know that its personally important to me to attempt to get rid of ACS
>>> custom stuff and introduce patterns, frameworks, libraries, etc that I
>>> feel are more consistent with modern Java development and are
>>> understood by a wider audience.  This is one of the basic reasons I
>>> started the spring-modularization branch.  I just want to be able to
>>> leverage Spring in a sane way.  The current implementation in ACS is
>>> backwards and broken and abuses Spring to the point that leveraging
>>> Spring isn't really all that possible.
>>>
>>> So while I did the Spring work, I also started laying the ground work
>>> to get rid of the ACS custom transaction management.  The custom DAO
>>> framework and the corresponding transaction management has been a huge
>>> barrier to me extending ACS in the past.  When you look at how you are
>>> supposed to access the database, it's all very custom and what I feel
>>> isn't really all that straight forward.  I was debugging an issue
>>> today and figured out there is a huge bug in what I've done and that
>>> has lead me down this rabbit hole of what the correct solution is.
>>> Additionally ACS custom transaction mgmt is done in a way that
>>> basically breaks Spring too.
>>>
>>> At some point on the mailing list there was a small discussion about
>>> removing the @DB interceptor.  The @DB interceptor does txn.open() and
>>> txn.close() around a method.  If a method forgets to commit or
>>> rollback the txn, txn.close() will rollback the transaction for the
>>> method.  So the general idea of the change was to instead move that
>>> logic to the bottom of the call stack.  The assumption being that the
>>> @DB code was just an additional check to ensure the programmer didn't
>>> forget something and we could instead just do that once at the bottom
>>> of the stack.  Oh how wrong I was.
>>>
>>> The problem is that developers have relied on the @DB interceptor to
>>> handle rollback for them.  So you see the following code quite a bit
>>>
>>> txn.start()
>>> ...
>>> txn.commit()
>>>
>>> And there is no sign of a rollback anywhere.  So the rollback will
>>> happen if some exception is thrown.  By moving the @DB logic to the
>>> bottom of stack what happens is the transaction is not rolled back
>>> when the developer thought it would and madness ensues.  So that
>>> change was bad.  So what to do....  Here's my totally bias description
>>> of solutions:
>>>
>>> Option A or "Custom Forever!":  Go back to custom ACS AOP and the @DB.
>>> This is what one would think is the simplest and safest solution.
>>> We'll it ain't really.  Here's the killer problem, besides that fact
>>> that it makes me feel very sad inside, the current rollback behavior
>>> is broken in certain spots in ACS.  While investigating possible
>>> solutions I started looking at all the places that do programmatic txn
>>> management.  It's important to realize that the txn framework only
>>> works properly if the method in which you do txn.start() has @DB on
>>> it.  There is a java assert in currentTxn() that attempts to make sure
>>> that @DB is there.  But.... nobody runs with asserts on.  So there are
>>> places in ACS where transactions are started and no @DB is there, but
>>> it happens to work because some method in the stack has @DB.  So to
>>> properly go back to option A we really need to fix all places that
>>> don't have @DB, plus make sure people always run with asserts on.  And
>>> then give up making the ACS world a better place and just do things
>>> how we always have...
>>>
>>> Option B or "Progress is Good":  The current transaction management
>>> approach (especially rollback) doesn't match how the majority of
>>> frameworks out there work.  This option is to change the Transaction
>>> class API to be more consistent with standard frameworks out there.  I
>>> propose the following APIs (if you know Spring TX mgmt, this will look
>>> familiar)
>>>
>>> 1) remove start(), commit(), rollback() - The easiest way to ensure we
>>> up date everything properly is to break the API and fix everything
>>> that is broken (about 433 places)
>>> 2) Add execute(TransactionCallback) where TransactionCallback has one
>>> method doInTransaction().  For somebody to run a transaction you would
>>> need to do
>>>
>>> txn.execute(new TransactionCallback() {
>>> Object doInTransaction() {
>>>  // do stuff
>>> }
>>> })
>>> 3) add "Object startTransaction()," commit(Object), and
>>> rollback(Object) - These methods are for callers who really really
>>> want to do thing programmatically.  To run a transaction you would do
>>>
>>> Object status = txn.startTransaction()
>>> try {
>>>  //.. do stuff
>>>  txn.commit(status)
>>> } catch (Exception e) {
>>>  txn.rollback(status)
>>> }
>>>
>>> I'm perfectly willing to go and change all the code for this.  It will
>>> just take a couple hours or so.  Option B is purposely almost exactly
>>> like Spring PlatformTransactionManager.  The reason being if we switch
>>> all the code to this style, we can later drop the implementation of
>>> Transaction and move to 100% fully Spring TX managed.
>>>
>>> Just as a final point, every custom approach or framework we have adds
>>> a barrier to people extending ACS and additionally puts more burden on
>>> the ACS community as that is more code we have to support.  If
>>> somebody today wants to know how DB transaction propagation works,
>>> there's zero documentation on it and probably 2 people who know how it
>>> works.  If we use a standard framework then somebody can just refer to
>>> that frameworks documentation, community, or stackexchange.
>>>
>>> So either option we can do and I'm opening this up to the community to
>>> decide.  If we go with Option A, a small portion of me will die
>>> inside, but so be it.
>>>
>>> Darren
>>


Mime
View raw message