cloudstack-users mailing list archives

From Indra Pramana <in...@sg.or.id>
Subject Re: Major stability problems lately
Date Fri, 23 May 2014 03:44:03 GMT
Hi Ilya and Chiradeep,

Good day to you, and thank you for your replies.

If it's a kernel bug, can you advise how to resolve it? Do you think
upgrading to the latest kernel would help? We are using Ubuntu 12.04.3 LTS
(GNU/Linux 3.5.0-23-generic x86_64).

Looking forward to your reply, thank you.

Cheers.




On Fri, May 23, 2014 at 9:56 AM, ilya musayev
<ilya.mailing.lists@gmail.com> wrote:

> This issue is kernel/hardware related; I've seen this in the past on
> non-CloudStack and non-KVM Linux hosts.
>
>
> On 5/22/14, 3:35 PM, Chiradeep Vittal wrote:
>
>> Sounds like a kernel bug.
>>
>> From: Indra Pramana <indra@sg.or.id>
>> Reply-To: "users@cloudstack.apache.org" <users@cloudstack.apache.org>
>> Date: Thursday, May 22, 2014 at 2:59 AM
>> To: "users@cloudstack.apache.org" <users@cloudstack.apache.org>
>> Cc: "ehlerst@gmail.com" <ehlerst@gmail.com>
>> Subject: Re: Major stability problems lately
>>
>> Hi Timothy and all,
>>
>> Apologies for replying to an old thread. I noticed that nobody replied to
>> you on this thread. May I know if you have managed to find the root cause
>> of the problem and the solution? I seem to have similar issues with one
>> of our hypervisors. We are using CloudStack 4.2.0 and KVM. The error
>> message is similar; it starts with a "BUG: soft lockup - CPU stuck" error
>> message. Nothing can be found in the cloudstack-agent.log file.
>>
>> http://pastebin.com/4GW9yPsm
>>
>> Looking forward to your reply, thank you.
>>
>> Cheers.
>>
>>
>>
>> On Thu, Nov 14, 2013 at 10:34 AM, Timothy Ehlers <ehlerst@gmail.com> wrote:
>>
>> We are experiencing massive instability and cannot determine what's
>> causing this.
>>
>> Every so often jsvc triggers the following in our system logs:
>>
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.258955] BUG: soft lockup -
>> CPU#24 stuck for 22s! [jsvc:60385]
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.266229] Modules linked in:
>> mptctl mptbase vhost_net macvtap macvlan 8021q garp ip6table_filter
>> ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
>> xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables nfsd
>> kvm_amd kvm ghash_clmulni_intel aesni_intel cryptd aes_x86_64 nfs microcode
>> psmouse radeon serio_raw ttm drm_kms_helper amd64_edac_mod joydev drm
>> edac_core fam15h_power k10temp edac_mce_amd i2c_algo_bit sp5100_tco
>> i2c_piix4 hpilo hpwdt lockd bridge stp mac_hid llc fscache auth_rpcgss
>> acpi_power_meter nfs_acl bonding sunrpc lp parport hid_generic usbhid hid
>> pata_atiixp ixgbe dca hpsa mdio
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.266322] CPU 24
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.266323] Modules linked in:
>> mptctl mptbase vhost_net macvtap macvlan 8021q garp ip6table_filter
>> ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
>> xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables nfsd
>> kvm_amd kvm ghash_clmulni_intel aesni_intel cryptd aes_x86_64 nfs microcode
>> psmouse radeon serio_raw ttm drm_kms_helper amd64_edac_mod joydev drm
>> edac_core fam15h_power k10temp edac_mce_amd i2c_algo_bit sp5100_tco
>> i2c_piix4 hpilo hpwdt lockd bridge stp mac_hid llc fscache auth_rpcgss
>> acpi_power_meter nfs_acl bonding sunrpc lp parport hid_generic usbhid hid
>> pata_atiixp ixgbe dca hpsa mdio
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.266378]
>> Nov 13 18:59:31 cpegh0009 kernel: [15188599.266382] Pid: 60385, comm:
>> jsvc Not tainted 3.5.0-23-generic #35~precise1-Ubuntu HP ProLiant DL585
>>
>>
>> I am not sure if this is the cause of the high load or an after-effect.
>>
>> 03:25:01 PM  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15  blocked
>> 06:45:01 PM       31       982    36.95    39.33     41.50        0
>> 06:55:01 PM       17      1000    28.53    37.28     40.06        0
>> 07:05:01 PM       60       954   114.52    91.36     63.66        0
>> 07:15:01 PM       48       961    29.55    53.94     60.76        0
>> 07:25:01 PM       12       895    13.23    24.64     42.47        0
>> 07:35:01 PM        5       772     8.02    13.32     28.31        0
>>
>>
>> We run Ubuntu 12.04.3 LTS on HP DL585s with 64 AMD cores and 0.5 TB of RAM.
>> This will host approx 40~50 VMs (CentOS 5 guests).
>>
>> Agent version is:
>> Version: 1:4.0.2
>>
>> Any ideas?
>>
>> Perhaps gathering CPU usage data on the jsvc PID?
>>
>> --
>> Tim Ehlers
>>
>>
>>
>>
>
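
The original post in this thread suggests gathering CPU usage data on the jsvc PID. Below is a minimal sketch of one way to do that on a Linux host, assuming /proc is mounted and readable. The helper names (cpu_ticks, cpu_percent, sample_pid) are hypothetical and are not part of CloudStack or jsvc; the sketch only reads the documented utime and stime fields of /proc/<pid>/stat and would not by itself explain a soft lockup.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split only the text that follows the last ')'.
    rest = stat_line[stat_line.rfind(")") + 2:].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                interval_s: float, ticks_per_s: int) -> float:
    """CPU use over the interval as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample_pid(pid: int, interval_s: float = 1.0) -> float:
    """Sample /proc/<pid>/stat twice and return the CPU percentage (Linux only)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(f"/proc/{pid}/stat") as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

For example, sample_pid(60385, 5.0) would sample the PID shown in the log above for five seconds. Because the value is relative to a single core, a multi-threaded process such as a JVM can report more than 100.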
