cloudstack-issues mailing list archives

From "Sangeetha Hariharan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-4616) When system Vms fail to start when host is down , link local Ip addresses do not get released resulting in all the link local Ip addresses being consumed eventually.
Date Wed, 08 Jan 2014 19:00:52 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865749#comment-13865749 ]

Sangeetha Hariharan commented on CLOUDSTACK-4616:
-------------------------------------------------

Tested with the latest build from 4.3.

In my setup I have 1 zone, 1 cluster, and 1 host, with the SSVM, CPVM, 1 router, and user VM running.

I power down the host.

The host continues to be in the “UP” state, which is as expected.

We would expect the SSVM and CPVM to be marked as “Stopped”, with attempts being made to start them.
I don’t see this happen. The VMs remain in the “Running” state and the agent state is "UP".
This differs from the behavior noted in the bug, where the SSVM and CPVM actually get marked as "Stopped" and constant attempts are made to restart them.

I do see the timeouts happening in the logs:


2014-01-08 13:05:46,704 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-5:ctx-71686d7f) Seq 6-1177616388: Timed out on Seq 6-1177616388:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:06:46,716 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-9:ctx-8cd341ad) Seq 6-1177616389: Timed out on Seq 6-1177616389:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:07:46,738 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-13:ctx-3fc81ec7) Seq 6-1177616390: Timed out on Seq 6-1177616390:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:08:46,744 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-2:ctx-6e0e5ac0) Seq 6-1177616391: Timed out on Seq 6-1177616391:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:09:46,756 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-6:ctx-15e6dc40) Seq 6-1177616392: Timed out on Seq 6-1177616392:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:10:46,772 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-14:ctx-e3162a4a) Seq 6-1177616393: Timed out on Seq 6-1177616393:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:11:46,787 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-1:ctx-6c539f44) Seq 6-1177616394: Timed out on Seq 6-1177616394:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:12:46,802 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-8:ctx-07ad89f9) Seq 6-1177616395: Timed out on Seq 6-1177616395:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:13:46,818 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-5:ctx-c2954c2b) Seq 6-1177616396: Timed out on Seq 6-1177616396:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
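
As a rough way to confirm the cadence of these timeouts, the entries can be parsed and the gap between consecutive ones computed. The following is a minimal Python sketch, not part of CloudStack: the regular expression and the helper names (`parse_timeouts`, `gaps_in_seconds`) are illustrative assumptions, and it assumes each log entry has been joined onto a single line. The sample input is the first three entries from the log above.

```python
import re
from datetime import datetime

# First three entries from the log above, each joined onto one line.
LOG = """\
2014-01-08 13:05:46,704 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-5:ctx-71686d7f) Seq 6-1177616388: Timed out on Seq 6-1177616388:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:06:46,716 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-9:ctx-8cd341ad) Seq 6-1177616389: Timed out on Seq 6-1177616389:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:07:46,738 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-13:ctx-3fc81ec7) Seq 6-1177616390: Timed out on Seq 6-1177616390:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
"""

# Illustrative pattern (an assumption, not a CloudStack-defined format):
# timestamp, the AgentAttache logger tag, the timed-out sequence, the command class.
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} WARN\s+"
    r"\[c\.c\.a\.m\.AgentAttache\].*?Timed out on Seq (?P<seq>\d+-\d+):"
    r'.*?"(?P<cmd>com\.cloud\.[\w.]+)"'
)


def parse_timeouts(text):
    """Return (timestamp, sequence, short command name) for each timeout entry."""
    entries = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            short_cmd = match.group("cmd").rsplit(".", 1)[-1]
            entries.append((ts, match.group("seq"), short_cmd))
    return entries


def gaps_in_seconds(entries):
    """Whole seconds between consecutive entries (milliseconds are ignored)."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(entries, entries[1:])]


entries = parse_timeouts(LOG)
for ts, seq, cmd in entries:
    print(ts, seq, cmd)
print("gaps (s):", gaps_in_seconds(entries))
```

On the three sample entries the gaps come out to 60 seconds each, matching the one-minute spacing visible in the timestamps.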


> When system Vms fail to start when host is down ,  link local Ip addresses do not get released resulting in all the link local Ip addresses being consumed eventually.
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4616
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4616
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.2.1
>         Environment: Build from 4.2-forward
>            Reporter: Sangeetha Hariharan
>            Assignee: Murali Reddy
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: hostdown.rar
>
>
> When system Vms fail to start when host is down ,  link local Ip addresses do not get released resulting in all the link local Ip addresses being consumed eventually.
> Steps to reproduce the problem:
> Advanced zone with 1 cluster having 1 host (XenServer).
> Had the SSVM, CCPVM, 2 routers, and a few user VMs running on the host.
> Power down the host.
> When the host was powered down, it was still marked as being in the "Up" state. This bug is tracked in CLOUDSTACK-2140.
> Attempts to restart all the system VMs on the host that is down are made continuously, and they fail.
> These failed attempts do not release the link local IP, so all link local IPs end up being consumed.
> When the host is actually powered on, attempts to start the system VMs fail because of the following exception seen in the management server logs:
> 013-09-05 12:00:09,551 INFO  [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Insufficient capacity
> com.cloud.exception.InsufficientAddressCapacityException: Insufficient link local address capacityScope=interface com.cloud.dc.DataCenter; id=1
>         at com.cloud.network.guru.ControlNetworkGuru.reserve(ControlNetworkGuru.java:156)
>         at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2157)
>         at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2127)
>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:886)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:578)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:571)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:267)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:696)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1300)
>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:123)
>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
>         at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:104)
>         at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:33)
>         at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:81)
>         at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:72)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> mysql> select * from op_dc_link_local_ip_address_alloc where data_center_id=1 and taken is null;
> Empty set (0.00 sec)
>   
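
The leak described in the quoted report (an address reserved for each start attempt but not released when the attempt fails) can be illustrated with a small simulation. This is a minimal Python sketch under stated assumptions, not CloudStack's actual code: `LinkLocalPool`, `try_start_system_vm`, and `attempts_until_exhausted` are hypothetical names, the pool is an in-memory stand-in for the rows of op_dc_link_local_ip_address_alloc, and the 169.254.0.x addresses and pool size of 4 are made up for illustration.

```python
class LinkLocalPool:
    """Hypothetical in-memory stand-in for the link local address table."""

    def __init__(self, addresses):
        # False means the address is free, True means it is taken.
        self.taken = {addr: False for addr in addresses}

    def reserve(self):
        """Mark the first free address as taken and return it."""
        for addr, is_taken in self.taken.items():
            if not is_taken:
                self.taken[addr] = True
                return addr
        raise RuntimeError("Insufficient link local address capacity")

    def release(self, addr):
        self.taken[addr] = False

    def free_count(self):
        return sum(1 for is_taken in self.taken.values() if not is_taken)


def try_start_system_vm(pool, host_reachable, release_on_failure):
    """One start attempt: reserve an address, then succeed or fail."""
    addr = pool.reserve()
    if host_reachable:
        return addr
    # The start failed. Without this release the address stays marked as taken.
    if release_on_failure:
        pool.release(addr)
    return None


def attempts_until_exhausted(pool_size, release_on_failure, max_attempts=10):
    """Return the attempt number on which reserve() fails, or None if it never does."""
    pool = LinkLocalPool([f"169.254.0.{i}" for i in range(1, pool_size + 1)])
    for attempt in range(1, max_attempts + 1):
        try:
            try_start_system_vm(pool, host_reachable=False,
                                release_on_failure=release_on_failure)
        except RuntimeError:
            return attempt
    return None


# Host is down for every attempt in both runs.
print(attempts_until_exhausted(4, release_on_failure=False))  # leaky retry loop
print(attempts_until_exhausted(4, release_on_failure=True))   # releases on failure
```

With a pool of 4 addresses, the leaky variant fails to reserve on the fifth attempt, while the variant that releases on failure never runs out within the 10 attempts simulated. An empty result from the `taken is null` query above corresponds to `free_count()` reaching zero in this model.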



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
