cloudstack-issues mailing list archives

From "Marcus Sorensen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-5432) [Automation] Libvtd getting crashed and agent going to alert start
Date Tue, 07 Jan 2014 05:54:53 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863929#comment-13863929 ]

Marcus Sorensen commented on CLOUDSTACK-5432:
---------------------------------------------

The section of the log below shows that the previous issue is *NOT* happening. Previously,
when the agent hit 'Timed out waiting for domain to shut down gracefully' (this looks like a
non-ACPI template if it's not shutting down), it would try to disconnect the cdrom AND the disks.
Now it simply moves on to cleaning up the NICs and returns success on StopCommand. In fact,
I can't tell from these logs that *anything* went wrong, aside from losing the connection
to the mgmt server. Not being able to connect to 10.223.49.195 port 8250 is the first sign
of a problem.

2014-01-06 02:58:45,409 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-1:null)
Timed out waiting for domain i-177-269-QA to shutdown gracefully
2014-01-06 02:58:45,885 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-1:null)
Unable to clean up disk with null path (perhaps empty cdrom drive):<disk  device='cdrom'
type='file'>
<driver name='qemu' type='raw' cache='none' />
<source file=''/>
<target dev='hdc' bus='ide'/>
</disk>

2014-01-06 02:58:45,886 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Executing:
/bin/bash -c ls /sys/class/net/brem1-2322 
2014-01-06 02:58:45,919 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Execution
is successful.
2014-01-06 02:58:45,920 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Executing:
/bin/bash -c ls /sys/class/net/brem1-2322/brif | tr '\n' ' ' 
2014-01-06 02:58:45,935 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Execution
is successful.
2014-01-06 02:58:45,935 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null)
Executing: /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh -o delete -v
2322 -p em1 -b brem1-2322 
2014-01-06 02:58:46,331 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null)
Execution is successful.
2014-01-06 02:58:46,332 DEBUG [cloud.agent.Agent] (agentRequest-Handler-1:null) Seq 2-812254404:
 { Ans: , MgmtId: 29066118877352, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.StopAnswer":{"result":true,"wait":0}}]
}
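
The check I did by eye on the excerpt above can be sketched in Python: look for the
shutdown timeout warning, then see whether the same handler thread later reports a
StopAnswer and with what result. This is only a rough sketch. The regexes and the function
name are my own assumptions based on the log lines quoted here, not anything from the
CloudStack code, and it assumes each wrapped log entry has been joined back onto one line.

```python
import re

# Patterns are assumptions derived from the log excerpt above, not a CloudStack API.
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\] \((?P<thread>[^)]+)\) (?P<msg>.*)$"
)
TIMEOUT = re.compile(r"Timed out waiting for domain (?P<vm>\S+) to shutdown gracefully")
STOP_ANSWER = re.compile(
    r'"com\.cloud\.agent\.api\.StopAnswer":\{"result":(?P<result>true|false)'
)


def stop_results_after_timeout(lines):
    """Yield (vm, timeout_ts, answer_ts, result) for each graceful-shutdown
    timeout that is later followed by a StopAnswer on the same handler thread."""
    pending = {}  # thread name -> (vm name, timestamp of the timeout warning)
    for line in lines:
        entry = ENTRY.match(line.strip())
        if entry is None:
            continue  # continuation line or unrelated text
        thread, msg = entry["thread"], entry["msg"]
        timed_out = TIMEOUT.search(msg)
        if timed_out:
            pending[thread] = (timed_out["vm"], entry["ts"])
            continue
        answer = STOP_ANSWER.search(msg)
        if answer and thread in pending:
            vm, timeout_ts = pending.pop(thread)
            yield vm, timeout_ts, entry["ts"], answer["result"] == "true"
```

A timeout that is followed by a StopAnswer with result true matches what the excerpt
shows: the stop completed and the agent reported success, so the failure has to be
somewhere else.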


> [Automation] Libvtd getting crashed and agent going to alert start 
> -------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5432
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5432
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: KVM
>    Affects Versions: 4.3.0
>         Environment: KVM (RHEL 6.3)
> Branch : 4.3
>            Reporter: Rayees Namathponnan
>            Assignee: Marcus Sorensen
>            Priority: Blocker
>             Fix For: 4.3.0
>
>         Attachments: CLOUDSTACK-5432_Jan_06.rar, KVM_Automation_Dec_11.rar, agent1.rar,
agent2.rar, management-server.rar
>
>
> This issue is observed in the 4.3 automation environment: libvirt crashed and the CloudStack
agent went to the alert state.
> Please see the agent log; the connection between the agent and the MS was lost with the error
"Connection closed with -1 on reading size." at 2013-12-09 19:47:06,969
> 2013-12-09 19:43:41,495 DEBUG [cloud.agent.Agent] (agentRequest-Handler-2:null) Processing
command: com.cloud.agent.api.GetStorageStatsCommand
> 2013-12-09 19:47:06,969 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Location
1: Socket Socket[addr=/10.223.49.195,port=8250,localport=40801] closed on read.  Probably
-1 returned: Connection closed with -1 on reading size.
> 2013-12-09 19:47:06,969 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing
socket Socket[addr=/10.223.49.195,port=8250,localport=40801]
> 2013-12-09 19:47:06,969 DEBUG [cloud.agent.Agent] (Agent-Handler-3:null) Clearing watch
list: 2
> 2013-12-09 19:47:11,969 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection
to the server. Dealing with the remaining commands...
> 2013-12-09 19:47:11,970 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Cannot connect
because we still have 5 commands in progress.
> 2013-12-09 19:47:16,970 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection
to the server. Dealing with the remaining commands...
> 2013-12-09 19:47:16,990 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Cannot connect
because we still have 5 commands in progress.
> 2013-12-09 19:47:21,990 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection
to the server. Dealing with the remaining commands.. 
> Please see the libvirtd log from the same time (the complete log is attached; there
is a 5 hour difference between the agent log and the libvirt log):
> 2013-12-10 02:45:45.563+0000: 5938: error : qemuMonitorIO:574 : internal error End of
file from monitor
> 2013-12-10 02:45:47.663+0000: 5942: error : virCommandWait:2308 : internal error Child
process (/bin/umount /mnt/41b632b5-40b3-3024-a38b-ea259c72579f) status unexpected: exit status
16
> 2013-12-10 02:45:53.925+0000: 5943: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet14 root) status unexpected: exit status 2
> 2013-12-10 02:45:53.929+0000: 5943: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet14 ingress) status unexpected: exit status 2
> 2013-12-10 02:45:54.011+0000: 5943: warning : qemuDomainObjTaint:1297 : Domain id=71
name='i-45-97-QA' uuid=7717ba08-be84-4b63-a674-1534f9dc7bef is tainted: high-privileges
> 2013-12-10 02:46:33.070+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet12 root) status unexpected: exit status 2
> 2013-12-10 02:46:33.081+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet12 ingress) status unexpected: exit status 2
> 2013-12-10 02:46:33.197+0000: 5940: warning : qemuDomainObjTaint:1297 : Domain id=72
name='i-47-111-QA' uuid=7fcce58a-96dc-4207-9998-b8fb72b446ac is tainted: high-privileges
> 2013-12-10 02:46:36.394+0000: 5938: error : qemuMonitorIO:574 : internal error End of
file from monitor
> 2013-12-10 02:46:37.685+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/bin/umount /mnt/41b632b5-40b3-3024-a38b-ea259c72579f) status unexpected: exit status
16
> 2013-12-10 02:46:57.869+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet15 root) status unexpected: exit status 2
> 2013-12-10 02:46:57.873+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet15 ingress) status unexpected: exit status 2
> 2013-12-10 02:46:57.925+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet17 root) status unexpected: exit status 2
> 2013-12-10 02:46:57.933+0000: 5940: error : virCommandWait:2308 : internal error Child
process (/sbin/tc qdisc del dev vnet17 ingress) status unexpected: exit status 2
> 2013-12-10 02:46:58.034+0000: 5940: warning : qemuDomainObjTaint:1297 : Domain id=73
name='r-114-QA' uuid=8ded6f1b-69e7-419d-8396-5795372d0ae2 is tainted: high-privileges
> 2013-12-10 02:47:22.762+0000: 5938: error : qemuMonitorIO:574 : internal error End of
file from monitor
> 2013-12-10 02:47:23.273+0000: 5939: error : virCommandWait:2308 : internal error Child
process (/bin/umount /mnt/41b632b5-40b3-3024-a38b-ea259c72579f) status unexpected: exit status
16
> The virsh command does not return anything and hangs:
> [root@Rack2Host11 libvirt]# virsh list
> Workaround:
> If I restart libvirtd, the agent can connect to the MS.
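
A hung libvirtd like the one in the description can be detected from a script by running
the same `virsh list` with a timeout. Below is a minimal Python sketch. The function name
and the 10 second default are my own assumptions; it only reports whether the command came
back in time and does not restart anything.

```python
import subprocess


def command_responds(argv, timeout_seconds=10.0):
    """Return True if the command exits within the timeout, False if it hangs.

    A non-zero exit code still counts as a response: the point is only to
    tell "returned" apart from "never came back".
    """
    try:
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising this exception.
        return False
    return True
```

On the KVM host this would be called as `command_responds(["virsh", "list"])`. If the
`virsh` binary is not installed, `subprocess.run` raises `FileNotFoundError` instead of
returning False, so a missing tool is not mistaken for a hang.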



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
