cloudstack-issues mailing list archives

From "Marcus Sorensen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-5430) KVM - Primary store down - Not able to start VMs/take snapshots after the primary store is brought down and brought back up again.
Date Wed, 08 Jan 2014 16:17:51 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865591#comment-13865591 ]

Marcus Sorensen commented on CLOUDSTACK-5430:
---------------------------------------------

I'll take a look at the NPE.  I should note that I think the steps to reproduce this will
guarantee a forced reboot of the KVM host anyway.  Generally if I NFS mount something, make
I/O intensive processes dependent on the mount, and then lose contact with the NFS server
it guarantees that the host will never recover. You can lazy unmount, but the processes that
were using the mount are usually stuck in D state indefinitely.
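One way to confirm that a host is in this condition is to list the processes that are in uninterruptible sleep. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function names are made up for illustration and are not part of CloudStack or the KVM agent.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' instead of on spaces.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that stays in this list across repeated runs after the NFS server has gone away is a reasonable sign that it is blocked on the dead mount.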

> KVM - Primary store down - Not able to start VMs/take snapshots after the primary store is brought down and brought back up again.
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5430
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5430
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: Marcus Sorensen
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: psdown.rar
>
>
> KVM - Primary store down - Not able to start VMs/take snapshots after the primary store is brought down and brought back up again.
> Set up:
> Advanced zone with KVM (RHEL 6.3) hosts.
> Steps to reproduce the problem:
> 1. Deploy a few VMs on each of the hosts with a 10 GB ROOT volume size, so we start with 10 VMs.
> 2. Create snapshots for the ROOT volumes.
> 3. While the snapshots are still in progress, make the primary storage unavailable for 10 minutes.
> This causes the KVM hosts to reboot.
> But the reboot of the KVM hosts is not successful. It is stuck trying to unmount the NFS mount points. This is tracked in CLOUDSTACK-5429.
> Stop and start the KVM hosts manually to work around this problem.
> At this point all the VMs are marked as "Stopped" in CloudStack.
> 4. Now make the primary store available.
> 5. Attempt to start the VM.
> It fails to start with the following exception:
> 2013-12-09 20:35:55,891 DEBUG [c.c.a.t.Request] (AgentManager-Handler-2:null) Seq 2-1983250480: Processing:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:2488)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1260)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:498)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:806)\n\tat com.cloud.utils.nio.Task.run(Task.java:83)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat java.lang.Thread.run(Thread.java:679)\n","wait":0}}] }
> 2013-12-09 20:35:55,891 DEBUG [c.c.a.t.Request] (StatsCollector-3:ctx-f0d35c47) Seq 2-1983250480: Received:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, { Answer } }
> 2013-12-09 20:35:56,939 DEBUG [c.c.a.ApiServlet] (catalina-exec-13:ctx-35adede4) ===START===  10.216.50.147 -- GET  command=queryAsyncJobResult&jobId=489806e9-96f9-4940-9ea0-6bd9516aabb0&response=json&sessionkey=qRSeXYRCfc1PSAXcomRT8ue1f%2BE%3D&_=1386639381768
> 2013-12-09 20:35:56,953 DEBUG [c.c.a.ApiServlet] (catalina-exec-13:ctx-35adede4 ctx-065180b8) ===END===  10.216.50.147 -- GET  command=queryAsyncJobResult&jobId=489806e9-96f9-4940-9ea0-6bd9516aabb0&response=json&sessionkey=qRSeXYRCfc1PSAXcomRT8ue1f%2BE%3D&_=1386639381768
> 2013-12-09 20:35:59,322 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) Seq 1-539557989: Processing:  { Ans: , MgmtId: 82324189320212, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.disconnectPhysicalDisksViaVmSpec(KVMStoragePoolManager.java:181)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3672)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1282)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:498)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:806)\n\tat com.cloud.utils.nio.Task.run(Task.java:83)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat java.lang.Thread.run(Thread.java:679)\n","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}}] }
> 2013-12-09 20:35:59,322 DEBUG [c.c.a.t.Request] (Job-Executor-26:ctx-0382e21d ctx-d8f9d323) Seq 1-539557989: Received:  { Ans: , MgmtId: 82324189320212, via: 1, Ver: v1, Flags: 10, { Answer, Answer, Answer, Answer, Answer, Answer } }
> 6. Attempting to take snapshots also fails with the following exception:
> 2013-12-09 20:54:10,509 DEBUG [c.c.a.t.Request] (AgentManager-Handler-10:null) Seq 2-1983250525: Processing:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CreateObjectAnswer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException","wait":0}}] }
> 2013-12-09 20:54:10,509 DEBUG [c.c.a.t.Request] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Seq 2-1983250525: Received:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, { CreateObjectAnswer } }
> 2013-12-09 20:54:10,509 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) create snapshot TestVM-tiny-host-0ps-0-4_ROOT-49_20131210014410 failed: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
> 2013-12-09 20:54:10,519 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Failed to take snapshot: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
> 2013-12-09 20:54:10,536 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:281)
>         at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>         at sun.reflect.GeneratedMethodAccessor230.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy161.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>         at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1461)
>         at sun.reflect.GeneratedMethodAccessor229.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy233.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:181)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
>         at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:520)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
> 2013-12-09 20:54:10,544 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Take snapshot: 49 failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
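When reading agent responses like the ones quoted above, it can help to separate the answer that actually failed from the "Stopped by previous failure" entries that follow it. The sketch below is illustrative only: it assumes the bracketed answer array has already been cut out of the log line as a JSON string, and the function names are hypothetical, not part of CloudStack.

```python
import json

# Marker text copied from the quoted log; answers carrying it were
# skipped because an earlier command in the same batch failed.
FOLLOW_ON = "Stopped by previous failure"


def failed_answers(payload):
    """Return (answer_type, details) for every answer whose result is false."""
    failures = []
    for answer in json.loads(payload):
        for answer_type, body in answer.items():
            if not body.get("result", False):
                failures.append((answer_type, body.get("details", "")))
    return failures


def root_cause(payload):
    """Return the first failed answer that is not a follow-on, or None."""
    for answer_type, details in failed_answers(payload):
        if details != FOLLOW_ON:
            return answer_type, details
    return None
```

Applied to the six-answer response for Seq 1-539557989, this would return the NullPointerException raised in KVMStoragePoolManager.disconnectPhysicalDisksViaVmSpec and skip the five follow-on entries.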



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
