cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Funs Kessen <FKes...@schubergphilis.com>
Subject RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0
Date Thu, 10 Jul 2014 11:40:20 GMT
Hi Santhosh,

You're right indeed, it's the update statement in the update itself that adds the runid in
the where clause, it came in three years and three months ago by the looks of it.

Cheers,

Funs

> -----Original Message-----
> From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> Sent: Thursday, July 10, 2014 1:22 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> The mentioned issue by Funs was existing earlier as well but was hidden by
> just logging it when update was failed, now with new change, we resurfaced
> the problem by throwing it. Its not because of close though. We can see to fix
> the original problem.
> 
> Santhosh
> ________________________________________
> From: Likitha Shetty [likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 7:18 AM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Santhosh, I pulled the latest code and unfortunately the issue still exists.
> 
> Looks like the problem is because we try to close the transaction after it has
> been committed (which implicitly closes the transaction) -
> /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> Impl.java
>             txn.commit();
>         } catch (RuntimeException e) {
>             txn.rollback();
>             s_logger.warn("Unexpected exception, ", e);
>             throw new RuntimeException(e.getMessage(), e);
>         }finally {
>             txn.close();
>         }
> 
> Thanks,
> Likitha
> 
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 2:38 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Yeah I see that Ipasted the wrong bit of code *doh*, it was update that was
> the problem from the same file:
> 
> @@ -129,24 +133,29 @@
>      @DB
>      public void update(long id, long runid, Date lastUpdate) {
>          TransactionLegacy txn = TransactionLegacy.currentTxn();
> -        PreparedStatement pstmt = null;
>          try {
>              txn.start();
> +            try( PreparedStatement pstmt = txn.prepareStatement("update mshost
> set last_update=?, removed=null, alert_count=0 where id=? and runid=?");) {
> +                pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> +                pstmt.setLong(2, id);
> +                pstmt.setLong(3, runid);
> 
> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> -            pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> -            pstmt.setLong(2, id);
> -            pstmt.setLong(3, runid);
> +                int count = pstmt.executeUpdate();
> +                txn.commit();
> 
> -            int count = pstmt.executeUpdate();
> -            txn.commit();
> -
> -            if (count < 1) {
> -                throw new CloudRuntimeException("Invalid cluster session
> detected", new ClusterInvalidSessionException("runid " + runid + " is no
> longer valid"));
> +                if (count < 1) {
> +                    throw new CloudRuntimeException("Invalid cluster session
> detected", new ClusterInvalidSessionException("runid " + runid + " is no
> longer valid"));
> +                }
> +            }catch (SQLException e) {
> +                throw new
> + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
>              }
> -        } catch (Exception e) {
> -            s_logger.warn("Unexpected exception, ", e);
> -            throw new RuntimeException(e.getMessage(), e);
> +        } catch (RuntimeException e) {
> +            txn.rollback();
> +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> +            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
> +        }
> +        finally {
> +            txn.close();
>          }
>      }
> 
> 
> > -----Original Message-----
> > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> > Sent: Thursday, July 10, 2014 11:00 AM
> > To: dev@cloudstack.apache.org
> > Subject: Re: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Fixed already, good job Santosh
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:58, Hugo Trippaers <trippie@gmail.com> wrote:
> >
> > > Doesn't seem related to that particular commit, the function
> > invalidateRunSession appears to be dead code.
> > >
> > > I'll do some digging.
> > >
> > >
> > > Cheers,
> > >
> > > Hugo
> > >
> > > On 10 jul. 2014, at 10:40, Funs Kessen <FKessen@schubergphilis.com>
> > wrote:
> > >
> > >> Hi Devs,
> > >>
> > >> I recently noticed on master that after starting, stopping and
> > >> starting the
> > management server again I get stacktraces about every second.
> > >> After some digging with Daan we found that part of the
> > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> The
> > problem is however deeper rooted, as the code is supposed to update
> > the mshost table, but because another Session ID (actually _runId from
> > ClusterManagerImpl.java from the previous run)  is already in there
> > the update fails. The Session ID is based on time in milliseconds, and
> > offcourse changes when you stop and start the management server again.
> > >> Prior to the commit it failed silently, and has done so since the
> > >> initial
> > checkin it seems. The real question is what the original idea is
> > behind the _runId, and is it something that requires fixing ?
> > >>
> > >> I've pasted part of the commit, sql and stacktrace below.
> > >>
> > >> Cheers,
> > >>
> > >> Funs
> > >>
> > >> ===
> > >> mysql> select * from mshost;
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> | id | msid           | runid         | name    | state | version     
  | service_ip |
> > service_port | last_update         | removed | alert_count |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |         
 0
> |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> 1 row in set (0.00 sec)
> > >>
> > >> ===
> > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > >> Author: Santhosh Edukulla <santhosh.edukulla@gmail.com> 2014-07-
> 02
> > >> 10:38:16
> > >> Committer: Santhosh Edukulla <santhosh.edukulla@gmail.com> 2014-
> 07-
> > 04
> > >> 12:47:58
> > >>
> > >> Fixed Resource Leaks, null dereferences, few other issues reported
> > >> by coverity
> > >>
> > >> -
> > >>
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > m
> > >> pl.java diff --git
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >>
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> index 3d0c3f5..89d7d27 100644
> > >> ---
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> +++
> > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > >> +++ tDaoImpl.java
> > >> @@ -53,15 +53,14 @@
> > >>    @Override
> > >>    public void invalidateRunSession(long id, long runid) {
> > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > >> -        PreparedStatement pstmt = null;
> > >> -        try {
> > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > runid=0, state='Down' where id=? and runid=?");
> > >> -            pstmt.setLong(1, id);
> > >> -            pstmt.setLong(2, runid);
> > >> -
> > >> -            pstmt.executeUpdate();
> > >> +        try (PreparedStatement pstmt =
> > >> + txn.prepareStatement("update
> > mshost set runid=0, state='Down' where id=? and runid=?");){
> > >> +            if(pstmt != null) {
> > >> +                pstmt.setLong(1, id);
> > >> +                pstmt.setLong(2, runid);
> > >> +                pstmt.executeUpdate();
> > >> +            }
> > >>        } catch (SQLException e) {
> > >> -            throw new CloudRuntimeException("DB exception on " +
> > pstmt.toString(), e);
> > >> +            throw new
> > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > >> + e.getMessage(), e);
> > >>        }
> > >>    }
> > >>
> > >> ===
> > >> 2014-07-09 18:45:06,541 WARN
> [c.c.c.d.ManagementServerHostDaoImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster
> > >> session detected
> > >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster
> > >> heartbeat
> > >> java.lang.RuntimeException: update:Exception:Invalid cluster
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:155)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> > cluster session detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >>
> > >>
> > >>
> > >


Mime
View raw message