Return-Path: X-Original-To: apmail-cloudstack-dev-archive@www.apache.org Delivered-To: apmail-cloudstack-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D0F451028A for ; Wed, 9 Oct 2013 18:57:35 +0000 (UTC) Received: (qmail 1996 invoked by uid 500); 9 Oct 2013 18:57:34 -0000 Delivered-To: apmail-cloudstack-dev-archive@cloudstack.apache.org Received: (qmail 1920 invoked by uid 500); 9 Oct 2013 18:57:33 -0000 Mailing-List: contact dev-help@cloudstack.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cloudstack.apache.org Delivered-To: mailing list dev@cloudstack.apache.org Received: (qmail 1903 invoked by uid 99); 9 Oct 2013 18:57:32 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 09 Oct 2013 18:57:32 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of daan.hoogland@gmail.com designates 209.85.223.182 as permitted sender) Received: from [209.85.223.182] (HELO mail-ie0-f182.google.com) (209.85.223.182) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 09 Oct 2013 18:57:25 +0000 Received: by mail-ie0-f182.google.com with SMTP id as1so2354392iec.27 for ; Wed, 09 Oct 2013 11:57:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type; bh=6k8omwiynR9QB5gZHqNAWWugLF12mMEE7QAZnXrhDnQ=; b=0NC9Vfunxh/KRRyL8r/YqhVIJTcjdfMoXvzTsB4VtwJL1e3YCY5bqqFNag7v/ILP96 wCtafzMZScFPXk96YZkkSo19xEtOiDNa1/muJOKOZ0rjos8Q2G+oXqDVyxh38aeN5p3k PvJ3lm0b+LXJp2n4cPHy6V3C1xGHk5yhmOn2kyPF5UJn3qBoFrr03SGF7/7/Glj9PMZ6 ymbEAe8PYv8+c7h9vm3RHPy2TL4pjC0eDdpBsZFr5zA5vGZOAtNhac9vCDCINl1l94HH Z6+Cl6JZnr8dIE4Dygfk/eFaD4Cy4m9/stmxsGbcIwB5zFbV4TS/3MnoHy+jwZPt30Gv Q2qA== X-Received: by 10.42.190.142 with SMTP id di14mr2402135icb.45.1381345024125; Wed, 09 Oct 2013 11:57:04 -0700 (PDT) MIME-Version: 1.0 Received: by 10.64.129.38 with HTTP; Wed, 9 Oct 2013 11:56:44 -0700 (PDT) In-Reply-To: References: From: Daan Hoogland Date: Wed, 9 Oct 2013 20:56:44 +0200 Message-ID: Subject: Re: [DISCUSS] Transaction Hell To: dev Content-Type: multipart/alternative; boundary=20cf303ea4b879b9b504e8537373 X-Virus-Checked: Checked by ClamAV on apache.org --20cf303ea4b879b9b504e8537373 Content-Type: text/plain; charset=ISO-8859-1 Darren, Happy to hear your view on Spring. +1 for option B On Wed, Oct 9, 2013 at 8:06 PM, Kelven Yang wrote: > +1 > > Original Transaction class also has many tightly-coupled assumptions about > the underlying data source, lock master. Developers are usually lost on > when and where they should use @DB, for nested transactions, it does not > really work as expected. > > Kelven > > > On 10/9/13 10:38 AM, "Chiradeep Vittal" > wrote: > > >+1 to option B (for a lot of the reasons enunciated by Darren). > >Also, let's get this in right away so that by 1/31/2014 we are confident > >about the change and fixed any bugs uncovered by the new scheme. > > > >On 10/9/13 10:29 AM, "Darren Shepherd" > >wrote: > > > >>Pedro, > >> > >>From a high level I think we'd probably agree. Generally I feel an > >>IaaS platform is largely a metadata management framework that stores > >>the "desired" state of the infrastructure and then pro-actively tries > >>to reconcile the desired state with reality. So failures should be > >>recovered from easily as inconsistency will be discovered and > >>reconciled. Having sad that, ACS is not at all like that. It is very > >>task oriented. Hopefully I/we/everyone can change that, its a huge > >>concern of mine. The general approach in ACS I see is do task X and > >>hopefully it works. If it doesn't work, well hopefully we didn't > >>leave things in an inconsistent state. If we find it does leave > >>things in an inconsistent state, write a cleanup thread to fix bad > >>things in bad states.... > >> > >>Regarding TX specifically. This is a huge topic. I really don't know > >>where to start. I have so many complaints with the data access in > >>ACS. There's what I'd like to see, but its so far from what it really > >>is. Instead I'll address specifically your question. > >> > >>I wish we were doing transaction per API, but I don't think that was > >>ever a consideration. I do think the sync portion of API commands > >>should be wrapped in a single transaction. I really think the > >>original intention of the Transaction framework was to assist in > >>cleaning up resources that people always forget to close. I think > >>that is mostly it. > >> > >>The general guidelines of how I'd like transactions to work would be > >> > >>1) Synchronous portions of API commands are wrapped in a single > >>transaction. Transaction propagation capability from spring tx can > >>then handle nesting transaction as more complicated transaction > >>management may be need in certain places. > >> > >>2) Async jobs that run in a background threads should do small fine > >>grained transaction management. Ideally no transactions. > >>Transactions should not be used as a locking mechanism. > >> > >>Having said that, there are currently so many technical issues in > >>getting to that. For example, with point 1, because IPC/MessageBus > >>and EventBus were added recently, that makes it difficult to do 1. > >>The problem is that you can't send a message while a DB tx is open > >>because the reciever may get the message before the commit. So > >>messaging frameworks have to be written in consideration of the > >>transaction management. Not saying you need to do complex XA style > >>transactions, there's simpler ways to do that. So regarding points 1 > >>and 2 I said. That's what I'd like to see, but I know its a long road > >>to that. > >> > >>Option B is really about introducing an API that will eventually serve > >>as a lightweight wrapper around Spring TX. In the short term, if I do > >>option B, the implementation of the code will still be the custom ACS > >>TX mgmt. So across modules, its sorta kinda works but not really. > >>But if I do the second step of replacing custom ACS TX impl with > >>Spring TX, it will follow how Spring TX works. If we have Sprint TX > >>we can then leverage the transaction propagation features of it to > >>more sanely handle transaction nesting. > >> > >>I feel I went a bit the weeds with that response, but maybe something > >>in there made sense. > >> > >>Darren > >> > >>On Wed, Oct 9, 2013 at 9:31 AM, Pedro Roque Marques > >> wrote: > >>> Darren, > >>> My assumption when I tried to make sense of the transaction code is > >>>that the underlying motivation is that the code is trying to create a > >>>transaction per API call and then allow multiple modules to implement > >>>that API call... > >>> i.e. the intent is do use a bit of what i would call a "web-server > >>>logic"... > >>> > >>> 1. API call starts. > >>> 2. Module X starts transaction... > >>> 3. Module Y does some other changes in the DB... > >>> 4. Either the API call completes successfully or not... commit or error > >>>back to the user. > >>> > >>> I suspect that this was probably the starting point... but it doesn't > >>>really work as i describe above. Often when the plugin i'm working on > >>>screws up (or XenServer is misconfigured) one ends up with DB objects in > >>>inconsistent state. > >>> > >>> I suspect that the DB Transaction design needs to include what is the > >>>methodology for the design of the management server. > >>> > >>> In an ideal world, i would say that API calls just check authorization > >>>and quotas and should store the intent of the management server to reach > >>>the desired state. State machines that can then deal with transient > >>>failures should then attempt to move the state of the system to the > >>>state intended by the user. That however doesn't seem to reflect the > >>>current state of the management server. > >>> > >>> I may be completely wrong... Can you give an example in proposal B of > >>>how a transaction would span multiple modules of code ? > >>> > >>> Pedro. > >>> > >>> On Oct 9, 2013, at 1:44 AM, Darren Shepherd wrote: > >>> > >>>> Okay, please read this all, this is important... I want you all to > >>>> know that its personally important to me to attempt to get rid of ACS > >>>> custom stuff and introduce patterns, frameworks, libraries, etc that I > >>>> feel are more consistent with modern Java development and are > >>>> understood by a wider audience. This is one of the basic reasons I > >>>> started the spring-modularization branch. I just want to be able to > >>>> leverage Spring in a sane way. The current implementation in ACS is > >>>> backwards and broken and abuses Spring to the point that leveraging > >>>> Spring isn't really all that possible. > >>>> > >>>> So while I did the Spring work, I also started laying the ground work > >>>> to get rid of the ACS custom transaction management. The custom DAO > >>>> framework and the corresponding transaction management has been a huge > >>>> barrier to me extending ACS in the past. When you look at how you are > >>>> supposed to access the database, it's all very custom and what I feel > >>>> isn't really all that straight forward. I was debugging an issue > >>>> today and figured out there is a huge bug in what I've done and that > >>>> has lead me down this rabbit hole of what the correct solution is. > >>>> Additionally ACS custom transaction mgmt is done in a way that > >>>> basically breaks Spring too. > >>>> > >>>> At some point on the mailing list there was a small discussion about > >>>> removing the @DB interceptor. The @DB interceptor does txn.open() and > >>>> txn.close() around a method. If a method forgets to commit or > >>>> rollback the txn, txn.close() will rollback the transaction for the > >>>> method. So the general idea of the change was to instead move that > >>>> logic to the bottom of the call stack. The assumption being that the > >>>> @DB code was just an additional check to ensure the programmer didn't > >>>> forget something and we could instead just do that once at the bottom > >>>> of the stack. Oh how wrong I was. > >>>> > >>>> The problem is that developers have relied on the @DB interceptor to > >>>> handle rollback for them. So you see the following code quite a bit > >>>> > >>>> txn.start() > >>>> ... > >>>> txn.commit() > >>>> > >>>> And there is no sign of a rollback anywhere. So the rollback will > >>>> happen if some exception is thrown. By moving the @DB logic to the > >>>> bottom of stack what happens is the transaction is not rolled back > >>>> when the developer thought it would and madness ensues. So that > >>>> change was bad. So what to do.... Here's my totally bias description > >>>> of solutions: > >>>> > >>>> Option A or "Custom Forever!": Go back to custom ACS AOP and the @DB. > >>>> This is what one would think is the simplest and safest solution. > >>>> We'll it ain't really. Here's the killer problem, besides that fact > >>>> that it makes me feel very sad inside, the current rollback behavior > >>>> is broken in certain spots in ACS. While investigating possible > >>>> solutions I started looking at all the places that do programmatic txn > >>>> management. It's important to realize that the txn framework only > >>>> works properly if the method in which you do txn.start() has @DB on > >>>> it. There is a java assert in currentTxn() that attempts to make sure > >>>> that @DB is there. But.... nobody runs with asserts on. So there are > >>>> places in ACS where transactions are started and no @DB is there, but > >>>> it happens to work because some method in the stack has @DB. So to > >>>> properly go back to option A we really need to fix all places that > >>>> don't have @DB, plus make sure people always run with asserts on. And > >>>> then give up making the ACS world a better place and just do things > >>>> how we always have... > >>>> > >>>> Option B or "Progress is Good": The current transaction management > >>>> approach (especially rollback) doesn't match how the majority of > >>>> frameworks out there work. This option is to change the Transaction > >>>> class API to be more consistent with standard frameworks out there. I > >>>> propose the following APIs (if you know Spring TX mgmt, this will look > >>>> familiar) > >>>> > >>>> 1) remove start(), commit(), rollback() - The easiest way to ensure we > >>>> up date everything properly is to break the API and fix everything > >>>> that is broken (about 433 places) > >>>> 2) Add execute(TransactionCallback) where TransactionCallback has one > >>>> method doInTransaction(). For somebody to run a transaction you would > >>>> need to do > >>>> > >>>> txn.execute(new TransactionCallback() { > >>>> Object doInTransaction() { > >>>> // do stuff > >>>> } > >>>> }) > >>>> 3) add "Object startTransaction()," commit(Object), and > >>>> rollback(Object) - These methods are for callers who really really > >>>> want to do thing programmatically. To run a transaction you would do > >>>> > >>>> Object status = txn.startTransaction() > >>>> try { > >>>> //.. do stuff > >>>> txn.commit(status) > >>>> } catch (Exception e) { > >>>> txn.rollback(status) > >>>> } > >>>> > >>>> I'm perfectly willing to go and change all the code for this. It will > >>>> just take a couple hours or so. Option B is purposely almost exactly > >>>> like Spring PlatformTransactionManager. The reason being if we switch > >>>> all the code to this style, we can later drop the implementation of > >>>> Transaction and move to 100% fully Spring TX managed. > >>>> > >>>> Just as a final point, every custom approach or framework we have adds > >>>> a barrier to people extending ACS and additionally puts more burden on > >>>> the ACS community as that is more code we have to support. If > >>>> somebody today wants to know how DB transaction propagation works, > >>>> there's zero documentation on it and probably 2 people who know how it > >>>> works. If we use a standard framework then somebody can just refer to > >>>> that frameworks documentation, community, or stackexchange. > >>>> > >>>> So either option we can do and I'm opening this up to the community to > >>>> decide. If we go with Option A, a small portion of me will die > >>>> inside, but so be it. > >>>> > >>>> Darren > >>> > > > > --20cf303ea4b879b9b504e8537373--