From: Min Chen
To: Leo Simons, Koushik Das, "dev@cloudstack.apache.org"
Subject: Re: No event publish can be wrapped within db transaction...why?
Date: Tue, 18 Nov 2014 18:50:23 +0000

Hi Leo,

"NO EVENT PUBLISH CAN BE WRAPPED WITHIN DB TRANSACTION!" is along the same lines as "NO AGENT COMMAND CAN BE WRAPPED WITHIN DB TRANSACTION!". The rationale is simple: event subscriber execution, or agent command handling at the resource layer, may take a long time, and we don't want such a long transaction window to hold the DB for all that time.

As for your question about why we bother to use a message bus to communicate between two Java components, there is a reason for it: loose coupling. IAMApiServiceImpl is a class in the IAM plugin service, which can be deployed as a totally different service from the CloudStack management server. Ideally, with future 3rd-party authentication/authorization integration, it may use a totally different database from the "cloud" database we are currently using just for simplicity. In this deployment architecture, we have to make sure that the IAM service and the CloudStack MS components are loosely coupled. The message bus gives us a very good way to achieve that.

As you said, ideally we would like to have a perfect transaction covering account creation in both the CloudStack main component and its plugin services. In reality, this may not always work, and big transactions are error-prone in large-scale distributed systems, especially for loosely coupled components that span different DBs. The plugin architecture in CloudStack is designed so that each plugin component can easily be enabled or disabled without impacting the main CloudStack components too much.
So in this case, I would personally prefer that we ensure data integrity within the scope of the CloudStack main components first, and handle potential message handling failures in the plugin module separately through application-level logic.

Thanks
-min

On 11/18/14 5:52 AM, "Leo Simons" wrote:

>Hi Min, hi Koushik,
>
>Cloudstack is shouting at me:
>  NO EVENT PUBLISH CAN BE WRAPPED WITHIN DB TRANSACTION!
>
>(full stack trace below). I've learned this is happening on our
>systemvm-persistent-config feature branch because it has commit
>ffaabdc13fde0f0f7b2667a483006e2a4b805f63 but it does not have commit
>f585dd266188a134a9c8b911376b066b9d3806e8 yet.
>
>I'm now trying to understand what's happening here -- the transaction /
>concurrency / messaging logic gave me a significant headache with its
>triple negatives, nested transaction scoping and home-grown gates, but I
>think I got it now.
>
>As I understand it, in the olde world, creating an account:
>
>  * opens a database transaction
>    * creates an account in the db
>    * creates the first user in that account in the db
>    * publishes an event
>      * which is listened to by 0 subscribers
>    * commits the database transaction
>  * check the user is there
>    * opens a database transaction
>    * find the created user in the database
>    * (auto)closes transaction
>  * returns success if the user is in the db
>
>this, err, works, but in some other cases, apparently, there are concerns
>that the db transaction is open too long while message handling happens.
>So that's why the warning was added, and followed up on, and so now,
>creating an account:
>
>  * opens a database transaction
>    * creates an account in the db
>    * creates the first user in that account in the db
>    * commits the database transaction
>  * publishes an event
>    * which is still listened to by on average 0 subscribers,
>      but there could be an IAM subscriber
>  * check the user is there
>    * opens a database transaction
>    * find the created user in the database
>    * (auto)closes transaction
>  * returns success if the user is in the db
>
>The one possible subscriber for account creation is IAMApiServiceImpl,
>which when receiving the event
>
>  * opens a database transaction
>    * adds the account to acl_group_account_map
>    * commits the database transaction
>  * finds the domain for the account
>    * opens a database transaction
>    * finds the domain for the account
>    * (auto)closes transaction
>  * finds the domain groups for the domain
>    * opens a database transaction
>    * finds the domain groups for the domain
>    * (auto)closes transaction
>  * for each domain group
>    * opens a database transaction
>    * adds the account to acl_group_account_map
>    * commits the database transaction
>
>in other words, if there's 1 domain group and an enabled IAM thingie,
>this spreads out "make an account" over 6 transactions. Without IAM
>thingie it's 2 transactions with a no-op message bus thingie in the
>middle. Is that correct?
>
>If so, I don't understand this at all. The pre-November code doesn't make
>that much sense to me (why query the database? If you don't trust your
>database's ACID guarantees...why use transactions? Why do we ever need
>a message bus between two java components in the same classloader?), but
>the new code scares me.
>
>In the case of errors in between transactions, you can end up with
>accounts that are not in all the groups they should be in.
>I imagine I would much rather see the whole thing fail, and the complete
>api call fail, so that I can re-try it as a whole, than end up with a
>somehow half-initialized account. I.e. have everything
>account-management-y happen in one transaction which is rolled back on
>any failure.
>
>Any thoughts?
>
>
>Thanks!
>
>
>Leo (who can't ask Hugo since Hugo is at apachecon/ccceu and he isn't :))
>
>
>2014-11-18 13:36:33,145 ERROR [o.a.c.f.m.MessageBusBase]
>(qtp1734055321-25:ctx-05df2079 ctx-25ea4461 ctx-3aac3268)
>  NO EVENT PUBLISH CAN BE WRAPPED WITHIN DB TRANSACTION!
>  com.cloud.utils.exception.CloudRuntimeException:
>  NO EVENT PUBLISH CAN BE WRAPPED WITHIN DB TRANSACTION!
>  at org.apache.cloudstack.framework.messagebus.MessageBusBase.publish(MessageBusBase.java:167)
>  at com.cloud.user.AccountManagerImpl$2.doInTransaction(AccountManagerImpl.java:1052)
>  at com.cloud.user.AccountManagerImpl$2.doInTransaction(AccountManagerImpl.java:1027)
>  at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
>  at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
>  at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
>  at com.cloud.user.AccountManagerImpl.createUserAccount(AccountManagerImpl.java:1027)
>  at sun.reflect.GeneratedMethodAccessor181.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
>  at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>  at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>  at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
>  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>  at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
>  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>  at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>  at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>  at com.sun.proxy.$Proxy108.createUserAccount(Unknown Source)
>  at org.apache.cloudstack.api.command.admin.account.CreateAccountCmd.execute(CreateAccountCmd.java:178)
>  at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>  at com.cloud.api.ApiServer.queueCommand(ApiServer.java:691)
>  at com.cloud.api.ApiServer.handleRequest(ApiServer.java:514)
>  at com.cloud.api.ApiServlet.processRequestInContext(ApiServlet.java:273)
>  at com.cloud.api.ApiServlet$1.run(ApiServlet.java:117)
>  at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>  at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>  at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>  at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:114)
>  at com.cloud.api.ApiServlet.doPost(ApiServlet.java:81)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
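
The two orderings discussed in this thread (publish inside the transaction versus commit first, then publish) can be sketched in plain Java. This is only an illustration under stated assumptions: the class name, the thread-local depth counter, the in-memory event list and all method names are hypothetical stand-ins, not CloudStack's MessageBusBase, Transaction or AccountManagerImpl code. It shows a guard that refuses to publish while a transaction is open, and a reordered flow that publishes only after the transactional work has returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class PublishAfterCommit {
    // Hypothetical per-thread counter of open (possibly nested) transactions.
    static final ThreadLocal<Integer> TX_DEPTH = ThreadLocal.withInitial(() -> 0);

    // Hypothetical in-memory "bus": published events are simply recorded here.
    static final List<String> published = new ArrayList<>();

    // Runs work "inside a transaction"; the depth is restored even if work throws.
    static <T> T inTransaction(Supplier<T> work) {
        TX_DEPTH.set(TX_DEPTH.get() + 1);
        try {
            return work.get();
        } finally {
            TX_DEPTH.set(TX_DEPTH.get() - 1);
        }
    }

    // Guard in the spirit of the error in the thread: no publish while a transaction is open.
    static void publish(String event) {
        if (TX_DEPTH.get() > 0) {
            throw new IllegalStateException(
                "NO EVENT PUBLISH CAN BE WRAPPED WITHIN DB TRANSACTION!");
        }
        published.add(event);
    }

    // Old ordering: the event is published before the transaction ends, so the guard fires.
    static String createAccountOldOrder(String name) {
        return inTransaction(() -> {
            publish("account.created:" + name);
            return name;
        });
    }

    // New ordering: the transactional work completes first, then the event is published.
    static String createAccountNewOrder(String name) {
        String account = inTransaction(() -> name);
        publish("account.created:" + account);
        return account;
    }

    public static void main(String[] args) {
        try {
            createAccountOldOrder("alice");
        } catch (IllegalStateException e) {
            System.out.println("old order rejected: " + e.getMessage());
        }
        createAccountNewOrder("bob");
        System.out.println("published events: " + published);
    }
}
```

Note that the sketch also makes Leo's concern visible: in the new ordering a subscriber failure happens after the account work has already been committed, so any cleanup has to be handled in application-level logic rather than by a rollback.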