Return-Path: X-Original-To: apmail-cloudstack-dev-archive@www.apache.org Delivered-To: apmail-cloudstack-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 5166211CAB for ; Thu, 10 Jul 2014 08:53:20 +0000 (UTC) Received: (qmail 51309 invoked by uid 500); 10 Jul 2014 08:53:19 -0000 Delivered-To: apmail-cloudstack-dev-archive@cloudstack.apache.org Received: (qmail 51264 invoked by uid 500); 10 Jul 2014 08:53:19 -0000 Mailing-List: contact dev-help@cloudstack.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cloudstack.apache.org Delivered-To: mailing list dev@cloudstack.apache.org Received: (qmail 51252 invoked by uid 99); 10 Jul 2014 08:53:19 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 10 Jul 2014 08:53:19 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of FKessen@schubergphilis.com designates 195.66.90.57 as permitted sender) Received: from [195.66.90.57] (HELO sbprmx2.schubergphilis.com) (195.66.90.57) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 10 Jul 2014 08:53:16 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by sbprmx2.schubergphilis.com (Postfix) with ESMTP id 2DBA8128DD for ; Thu, 10 Jul 2014 10:52:52 +0200 (MEST) X-Virus-Scanned: amavisd-new at schubergphilis.com Received: from sbprmx2.schubergphilis.com ([127.0.0.1]) by localhost (sbprmx2.schubergphilis.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9oJZqo1Vperp for ; Thu, 10 Jul 2014 10:52:52 +0200 (MEST) Received: from SBPOTMG101.sbp.lan (edge.schubergphilis.com [195.66.90.11]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by sbprmx2.schubergphilis.com (Postfix) with ESMTP id 1E331128CE for ; Thu, 10 Jul 2014 10:52:52 +0200 (MEST) Received: from SBPOMF101.sbp.lan (10.71.2.130) by SBPOTMG101.sbp.lan (10.71.3.100) with Microsoft SMTP Server (TLS) id 14.3.195.1; Thu, 10 Jul 2014 10:52:51 +0200 Received: from SBPOMB102.sbp.lan ([fe80::b0aa:5d77:2600:dea9]) by SBPOMF101.sbp.lan ([fe80::253f:3a21:d553:7947%15]) with mapi id 14.03.0195.001; Thu, 10 Jul 2014 10:52:51 +0200 From: Funs Kessen To: "dev@cloudstack.apache.org" CC: Daan Hoogland Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0 Thread-Topic: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0 Thread-Index: Ac+cGZCF/C4p3URqRTKV9ekQFwhUzAAAbD+AAAA0XJA= Date: Thu, 10 Jul 2014 08:52:51 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.200.6.71] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Virus-Checked: Checked by ClamAV on apache.org Hi Likitha, Ah cool thanks! Cheers, Funs > -----Original Message----- > From: Likitha Shetty [mailto:likitha.shetty@citrix.com] > Sent: Thursday, July 10, 2014 10:50 AM > To: dev@cloudstack.apache.org > Cc: Daan Hoogland > Subject: RE: "Bug" introduced by part of commit > a600d8408ea86782318139c17cf346c8497943d0 >=20 > Funs, Santhosh fixed this problem yesterday. > https://issues.apache.org/jira/browse/CLOUDSTACK-7079 has more details. >=20 > Thanks, > Likitha >=20 > -----Original Message----- > From: Funs Kessen [mailto:FKessen@schubergphilis.com] > Sent: Thursday, July 10, 2014 2:10 PM > To: dev@cloudstack.apache.org > Cc: Daan Hoogland > Subject: "Bug" introduced by part of commit > a600d8408ea86782318139c17cf346c8497943d0 >=20 > Hi Devs, >=20 > I recently noticed on master that after starting, stopping and starting t= he > management server again I get stacktraces about every second. > After some digging with Daan we found that part of the > a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The > problem is however deeper rooted, as the code is supposed to update the > mshost table, but because another Session ID (actually _runId from > ClusterManagerImpl.java from the previous run) is already in there the > update fails. The Session ID is based on time in milliseconds, and offcou= rse > changes when you stop and start the management server again. > Prior to the commit it failed silently, and has done so since the initial > checkin it seems. The real question is what the original idea is behind t= he > _runId, and is it something that requires fixing ? >=20 > I've pasted part of the commit, sql and stacktrace below. >=20 > Cheers, >=20 > Funs >=20 > =3D=3D=3D > mysql> select * from mshost; > +----+----------------+---------------+---------+-------+----------------= +------------+-------------- > +---------------------+---------+-------------+ > | id | msid | runid | name | state | version = | service_ip | > service_port | last_update | removed | alert_count | > +----+----------------+---------------+---------+-------+----------------= +------------+-------------- > +---------------------+---------+-------------+ > | 1 | 90520734207775 | 1404924979461 | cs-mgmt | Up | 4.5.0- > SNAPSHOT | 127.0.0.1 | 9090 | 2014-07-09 16:58:07 | NULL | = 0 | > +----+----------------+---------------+---------+-------+----------------= +------------+-------------- > +---------------------+---------+-------------+ > 1 row in set (0.00 sec) >=20 > =3D=3D=3D > commit a600d8408ea86782318139c17cf346c8497943d0 > Author: Santhosh Edukulla 2014-07-02 > 10:38:16 > Committer: Santhosh Edukulla 2014-07-04 > 12:47:58 >=20 > Fixed Resource Leaks, null dereferences, few other issues reported by > coverity >=20 > - > framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI > mpl.java > diff --git > a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa > oImpl.java > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa > oImpl.java > index 3d0c3f5..89d7d27 100644 > --- > a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa > oImpl.java > +++ > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa > oImpl.java > @@ -53,15 +53,14 @@ > @Override > public void invalidateRunSession(long id, long runid) { > TransactionLegacy txn =3D TransactionLegacy.currentTxn(); > - PreparedStatement pstmt =3D null; > - try { > - pstmt =3D txn.prepareAutoCloseStatement("update mshost set r= unid=3D0, > state=3D'Down' where id=3D? and runid=3D?"); > - pstmt.setLong(1, id); > - pstmt.setLong(2, runid); > - > - pstmt.executeUpdate(); > + try (PreparedStatement pstmt =3D txn.prepareStatement("update ms= host > set runid=3D0, state=3D'Down' where id=3D? and runid=3D?");){ > + if(pstmt !=3D null) { > + pstmt.setLong(1, id); > + pstmt.setLong(2, runid); > + pstmt.executeUpdate(); > + } > } catch (SQLException e) { > - throw new CloudRuntimeException("DB exception on " + > pstmt.toString(), e); > + throw new > CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(), > e); > } > } >=20 > =3D=3D=3D > 2014-07-09 18:45:06,541 WARN [c.c.c.d.ManagementServerHostDaoImpl] > (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster sessi= on > detected > com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session > detected > at > com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management > ServerHostDaoImpl.java:147) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja > va:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess > orImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection( > AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin > point(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex > tInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex > poseInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam > icAopProxy.java:204) > at com.sun.proxy.$Proxy158.update(Unknown Source) > at > com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl > .java:545) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M > anagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call > (DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW > ithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi > thContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man > agedContextRunnable.java:46) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:30= 4) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac > cess$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru > n(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav > a:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja > va:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid > 1404924278923 is no longer valid > at > com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management > ServerHostDaoImpl.java:147) > ... 26 more > 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster- > Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat > java.lang.RuntimeException: update:Exception:Invalid cluster session > detected > at > com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management > ServerHostDaoImpl.java:155) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja > va:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess > orImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection( > AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin > point(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex > tInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex > poseInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R > eflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam > icAopProxy.java:204) > at com.sun.proxy.$Proxy158.update(Unknown Source) > at > com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl > .java:545) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M > anagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call > (DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW > ithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi > thContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man > agedContextRunnable.java:46) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:30= 4) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac > cess$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru > n(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav > a:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja > va:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid > cluster session detected > at > com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management > ServerHostDaoImpl.java:147) > ... 26 more > Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid > 1404924278923 is no longer valid > at > com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management > ServerHostDaoImpl.java:147) > ... 26 more >=20 >=20