cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Tutkowski <>
Subject Re: [DISCUSS] Transaction Hell
Date Wed, 09 Oct 2013 19:07:14 GMT
I think Option B is a good move.

On Wed, Oct 9, 2013 at 1:00 PM, Darren Shepherd <
> wrote:

> This is blocking my spring-modularization branch as I coupled changing
> the DB transactions stuff with the spring stuff (maybe not the best
> idea...).  So if everyone is on board with option 2, I'll look to get
> it done by probably Tuesday.  I'll create a branch from master,
> independent of the spring stuff I did.  If all is swell with that,
> merge the DB txn change, and merge the spring changes.
> Darren
> On Wed, Oct 9, 2013 at 10:38 AM, Chiradeep Vittal
> <> wrote:
> > +1 to option B (for a lot of the reasons enunciated by Darren).
> > Also, let's get this in right away so that by 1/31/2014 we are confident
> > about the change and fixed any bugs uncovered by the new scheme.
> >
> > On 10/9/13 10:29 AM, "Darren Shepherd" <>
> wrote:
> >
> >>Pedro,
> >>
> > >From a high level I think we'd probably agree.  Generally I feel an
> >>IaaS platform is largely a metadata management framework that stores
> >>the "desired" state of the infrastructure and then pro-actively tries
> >>to reconcile the desired state with reality.  So failures should be
> >>recovered from easily as inconsistency will be discovered and
> >>reconciled.  Having sad that, ACS is not at all like that.  It is very
> >>task oriented.  Hopefully I/we/everyone can change that, its a huge
> >>concern of mine.  The general approach in ACS I see is do task X and
> >>hopefully it works.  If it doesn't work, well hopefully we didn't
> >>leave things in an inconsistent state.  If we find it does leave
> >>things in an inconsistent state, write a cleanup thread to fix bad
> >>things in bad states....
> >>
> >>Regarding TX specifically.  This is a huge topic.  I really don't know
> >>where to start.  I have so many complaints with the data access in
> >>ACS.  There's what I'd like to see, but its so far from what it really
> >>is.  Instead I'll address specifically your question.
> >>
> >>I wish we were doing transaction per API, but I don't think that was
> >>ever a consideration.  I do think the sync portion of API commands
> >>should be wrapped in a single transaction.  I really think the
> >>original intention of the Transaction framework was to assist in
> >>cleaning up resources that people always forget to close.  I think
> >>that is mostly it.
> >>
> >>The general guidelines of how I'd like transactions to work would be
> >>
> >>1) Synchronous portions of API commands are wrapped in a single
> >>transaction.  Transaction propagation capability from spring tx can
> >>then handle nesting transaction as more complicated transaction
> >>management may be need in certain places.
> >>
> >>2) Async jobs that run in a background threads should do small fine
> >>grained transaction management.  Ideally no transactions.
> >>Transactions should not be used as a locking mechanism.
> >>
> >>Having said that, there are currently so many technical issues in
> >>getting to that.  For example, with point 1, because IPC/MessageBus
> >>and EventBus were added recently, that makes it difficult to do 1.
> >>The problem is that you can't send a message while a DB tx is open
> >>because the reciever may get the message before the commit.  So
> >>messaging frameworks have to be written in consideration of the
> >>transaction management.  Not saying you need to do complex XA style
> >>transactions, there's simpler ways to do that.  So regarding points 1
> >>and 2 I said.  That's what I'd like to see, but I know its a long road
> >>to that.
> >>
> >>Option B is really about introducing an API that will eventually serve
> >>as a lightweight wrapper around Spring TX.  In the short term, if I do
> >>option B, the implementation of the code will still be the custom ACS
> >>TX mgmt.  So across modules, its sorta kinda works but not really.
> >>But if I do the second step of replacing custom ACS TX impl with
> >>Spring TX, it will follow how Spring TX works.  If we have Sprint TX
> >>we can then leverage the transaction propagation features of it to
> >>more sanely handle transaction nesting.
> >>
> >>I feel I went a bit the weeds with that response, but maybe something
> >>in there made sense.
> >>
> >>Darren
> >>
> >>On Wed, Oct 9, 2013 at 9:31 AM, Pedro Roque Marques
> >><> wrote:
> >>> Darren,
> >>> My assumption when I tried to make sense of the transaction code is
> >>>that the underlying motivation is that the code is trying to create a
> >>>transaction per API call and then allow multiple modules to implement
> >>>that API call...
> >>> i.e. the intent is do use a bit of what i would call a "web-server
> >>>logic"...
> >>>
> >>> 1. API call starts.
> >>> 2. Module X starts transaction...
> >>> 3. Module Y does some other changes in the DB...
> >>> 4. Either the API call completes successfully or not... commit or error
> >>>back to the user.
> >>>
> >>>  I suspect that this was probably the starting point... but it doesn't
> >>>really work as i describe above. Often when the plugin i'm working on
> >>>screws up (or XenServer is misconfigured) one ends up with DB objects in
> >>>inconsistent state.
> >>>
> >>> I suspect that the DB Transaction design needs to include what is the
> >>>methodology for the design of the management server.
> >>>
> >>> In an ideal world, i would say that API calls just check authorization
> >>>and quotas and should store the intent of the management server to reach
> >>>the desired state. State machines that can then deal with transient
> >>>failures should then attempt to move the state of the system to the
> >>>state intended by the user. That however doesn't seem to reflect the
> >>>current state of the management server.
> >>>
> >>> I may be completely wrong... Can you give an example in proposal B of
> >>>how a transaction would span multiple modules of code ?
> >>>
> >>>   Pedro.
> >>>
> >>> On Oct 9, 2013, at 1:44 AM, Darren Shepherd wrote:
> >>>
> >>>> Okay, please read this all, this is important...  I want you all to
> >>>> know that its personally important to me to attempt to get rid of ACS
> >>>> custom stuff and introduce patterns, frameworks, libraries, etc that
> >>>> feel are more consistent with modern Java development and are
> >>>> understood by a wider audience.  This is one of the basic reasons I
> >>>> started the spring-modularization branch.  I just want to be able to
> >>>> leverage Spring in a sane way.  The current implementation in ACS is
> >>>> backwards and broken and abuses Spring to the point that leveraging
> >>>> Spring isn't really all that possible.
> >>>>
> >>>> So while I did the Spring work, I also started laying the ground work
> >>>> to get rid of the ACS custom transaction management.  The custom DAO
> >>>> framework and the corresponding transaction management has been a huge
> >>>> barrier to me extending ACS in the past.  When you look at how you are
> >>>> supposed to access the database, it's all very custom and what I feel
> >>>> isn't really all that straight forward.  I was debugging an issue
> >>>> today and figured out there is a huge bug in what I've done and that
> >>>> has lead me down this rabbit hole of what the correct solution is.
> >>>> Additionally ACS custom transaction mgmt is done in a way that
> >>>> basically breaks Spring too.
> >>>>
> >>>> At some point on the mailing list there was a small discussion about
> >>>> removing the @DB interceptor.  The @DB interceptor does and
> >>>> txn.close() around a method.  If a method forgets to commit or
> >>>> rollback the txn, txn.close() will rollback the transaction for the
> >>>> method.  So the general idea of the change was to instead move that
> >>>> logic to the bottom of the call stack.  The assumption being that the
> >>>> @DB code was just an additional check to ensure the programmer didn't
> >>>> forget something and we could instead just do that once at the bottom
> >>>> of the stack.  Oh how wrong I was.
> >>>>
> >>>> The problem is that developers have relied on the @DB interceptor to
> >>>> handle rollback for them.  So you see the following code quite a bit
> >>>>
> >>>> txn.start()
> >>>> ...
> >>>> txn.commit()
> >>>>
> >>>> And there is no sign of a rollback anywhere.  So the rollback will
> >>>> happen if some exception is thrown.  By moving the @DB logic to the
> >>>> bottom of stack what happens is the transaction is not rolled back
> >>>> when the developer thought it would and madness ensues.  So that
> >>>> change was bad.  So what to do....  Here's my totally bias description
> >>>> of solutions:
> >>>>
> >>>> Option A or "Custom Forever!":  Go back to custom ACS AOP and the @DB.
> >>>> This is what one would think is the simplest and safest solution.
> >>>> We'll it ain't really.  Here's the killer problem, besides that fact
> >>>> that it makes me feel very sad inside, the current rollback behavior
> >>>> is broken in certain spots in ACS.  While investigating possible
> >>>> solutions I started looking at all the places that do programmatic txn
> >>>> management.  It's important to realize that the txn framework only
> >>>> works properly if the method in which you do txn.start() has @DB on
> >>>> it.  There is a java assert in currentTxn() that attempts to make sure
> >>>> that @DB is there.  But.... nobody runs with asserts on.  So there are
> >>>> places in ACS where transactions are started and no @DB is there, but
> >>>> it happens to work because some method in the stack has @DB.  So to
> >>>> properly go back to option A we really need to fix all places that
> >>>> don't have @DB, plus make sure people always run with asserts on.  And
> >>>> then give up making the ACS world a better place and just do things
> >>>> how we always have...
> >>>>
> >>>> Option B or "Progress is Good":  The current transaction management
> >>>> approach (especially rollback) doesn't match how the majority of
> >>>> frameworks out there work.  This option is to change the Transaction
> >>>> class API to be more consistent with standard frameworks out there.
> >>>> propose the following APIs (if you know Spring TX mgmt, this will look
> >>>> familiar)
> >>>>
> >>>> 1) remove start(), commit(), rollback() - The easiest way to ensure
> >>>> up date everything properly is to break the API and fix everything
> >>>> that is broken (about 433 places)
> >>>> 2) Add execute(TransactionCallback) where TransactionCallback has one
> >>>> method doInTransaction().  For somebody to run a transaction you would
> >>>> need to do
> >>>>
> >>>> txn.execute(new TransactionCallback() {
> >>>> Object doInTransaction() {
> >>>>  // do stuff
> >>>> }
> >>>> })
> >>>> 3) add "Object startTransaction()," commit(Object), and
> >>>> rollback(Object) - These methods are for callers who really really
> >>>> want to do thing programmatically.  To run a transaction you would do
> >>>>
> >>>> Object status = txn.startTransaction()
> >>>> try {
> >>>>  //.. do stuff
> >>>>  txn.commit(status)
> >>>> } catch (Exception e) {
> >>>>  txn.rollback(status)
> >>>> }
> >>>>
> >>>> I'm perfectly willing to go and change all the code for this.  It will
> >>>> just take a couple hours or so.  Option B is purposely almost exactly
> >>>> like Spring PlatformTransactionManager.  The reason being if we switch
> >>>> all the code to this style, we can later drop the implementation of
> >>>> Transaction and move to 100% fully Spring TX managed.
> >>>>
> >>>> Just as a final point, every custom approach or framework we have adds
> >>>> a barrier to people extending ACS and additionally puts more burden
> >>>> the ACS community as that is more code we have to support.  If
> >>>> somebody today wants to know how DB transaction propagation works,
> >>>> there's zero documentation on it and probably 2 people who know how
> >>>> works.  If we use a standard framework then somebody can just refer
> >>>> that frameworks documentation, community, or stackexchange.
> >>>>
> >>>> So either option we can do and I'm opening this up to the community
> >>>> decide.  If we go with Option A, a small portion of me will die
> >>>> inside, but so be it.
> >>>>
> >>>> Darren
> >>>
> >

*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
o: 303.746.7302
Advancing the way the world uses the

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message