cloudstack-users mailing list archives

From Dmitry Berezhnoy <d.berezh...@emzior.ru>
Subject Re: Failed to get free memory without no reason
Date Fri, 26 Jul 2019 12:53:24 GMT
I'm not a Java developer either, but I am a Python developer, so I rewrote
getFreeMemory() in Python and found an error in the Java code.
When we create a new VM, getFreeMemory() fetches the list of all VMs from
libvirt and then iterates over that list. If the number of VMs is large (or
libvirt responds more slowly than usual), this can take some time. During
that time, one or more VMs in the "Running" state can be stopped by another
process (job) and disappear from libvirt. The lookup then raises
"libvirt.libvirtError: Domain not found: no domain with matching id 157124."
and we fall into the catch block, which reports a misleading error message
("Failed to get free memory").
I filed a bug report about this problem:
https://github.com/apache/cloudstack/issues/3526
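
A minimal Python sketch of the race described above, and of one way to
tolerate it. It does not use libvirt itself: DomainNotFound, FakeDomain and
the three functions are hypothetical stand-ins invented for illustration, and
it is only an assumption that getFreeMemory() subtracts the memory of running
domains from a host total.

```python
class DomainNotFound(Exception):
    """Hypothetical stand-in for libvirt.libvirtError ("Domain not found")."""


class FakeDomain:
    """Hypothetical stand-in for a domain returned by the initial listing."""

    def __init__(self, dom_id, max_mem_kib, gone=False):
        self.dom_id = dom_id
        self._max_mem_kib = max_mem_kib
        self._gone = gone  # True: stopped by another job after the listing

    def max_memory_kib(self):
        if self._gone:
            raise DomainNotFound(
                f"Domain not found: no domain with matching id {self.dom_id}"
            )
        return self._max_mem_kib


def used_memory_strict(domains):
    # Failure mode from the message: one vanished domain aborts the whole sum.
    return sum(d.max_memory_kib() for d in domains)


def used_memory_tolerant(domains):
    # Skip domains that disappeared between the listing and the lookup;
    # a stopped domain no longer holds memory on the host.
    total = 0
    for d in domains:
        try:
            total += d.max_memory_kib()
        except DomainNotFound:
            continue
    return total


def free_memory_kib(host_total_kib, domains):
    return host_total_kib - used_memory_tolerant(domains)
```

With a listing in which one domain has already vanished, used_memory_strict()
raises, while free_memory_kib() still returns a value based on the domains
that remain.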

On Fri, 26 Jul 2019 at 00:09, Andrija Panic <andrija.panic@gmail.com> wrote:

> Based on this (I'm no developer...)
>
> https://github.com/apache/cloudstack/blob/4.11/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStartCommandWrapper.java#L69
>
> ...it seems like CloudStack could not fetch free memory from libvirt - i.e. I
> would check the libvirt logs (you can also try to spin up a VM manually via virsh
> or Virtual Machine Manager) - or simply restart the libvirt daemon.
>
> Can you tell whether this error always happens on particular host(s)?
> libvirt is known to get stuck on very rare occasions - so a
> restart would help.
>
> BTW - you seem to be reaching 1.000.000 VMs created in total ??? If so -
> that's a nice number :)
>
>
> Cheers
> Andrija
>
>
>
>
> On Thu, 25 Jul 2019 at 17:10, Dmitry Berezhnoy <d.berezhnoy@emzior.ru>
> wrote:
>
> > *4.11.2.0
> >
> > On Thu, 25 Jul 2019 at 18:09, Dmitry Berezhnoy <d.berezhnoy@emzior.ru> wrote:
> >
> > > First log messages from 10.11.2.0
> > >
> > > On Thu, 25 Jul 2019 at 17:56, Andrija Panic <andrija.panic@gmail.com> wrote:
> > >
> > >> FYI, 4.10 is so broken that it's not considered an official release (check
> > >> for yourself on GitHub) - please upgrade to 4.11.3 if possible.
> > >>
> > >> Will check logs later today.
> > >>
> > >> Andrija
> > >>
> > >> On Thu, 25 Jul 2019 at 16:48, Dmitry Berezhnoy <d.berezhnoy@emzior.ru>
> > >> wrote:
> > >>
> > >> > The log from that CloudStack installation is really huge, so I downloaded
> > >> > a log from another CloudStack (version 4.10.0) with the same errors. See
> > >> > logid:f0c4abda, for example.
> > >> >
> > >> > 2019-07-25 15:57:58,895 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > (API-Job-Executor-149:ctx-d621076d job-6654658 ctx-d5e8eff7
> > >> > FirstFitRoutingAllocator) (logid:f0c4abda) Free RAM: 88188272640 ,
> > >> > Requested RAM: 402653184
> > >> > 2019-07-25 15:57:58,895 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > (API-Job-Executor-149:ctx-d621076d job-6654658 ctx-d5e8eff7
> > >> > FirstFitRoutingAllocator) (logid:f0c4abda) Host has enough CPU and RAM
> > >> > available
> > >> > 2019-07-25 15:57:58,895 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > (API-Job-Executor-149:ctx-d621076d job-6654658 ctx-d5e8eff7
> > >> > FirstFitRoutingAllocator) (logid:f0c4abda) STATS: Can alloc CPU from host:
> > >> > 1, used: 48556, reserved: 0, actual total: 144000, total with
> > >> > overprovisioning: 144000; requested cpu:384,alloc_from_last_host?:false
> > >> > ,considerReservedCapacity?: true
> > >> > 2019-07-25 15:57:58,895 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > (API-Job-Executor-149:ctx-d621076d job-6654658 ctx-d5e8eff7
> > >> > FirstFitRoutingAllocator) (logid:f0c4abda) STATS: Can alloc MEM from host:
> > >> > 1, used: 45902462976, reserved: 0, total: 134090735616; requested mem:
> > >> > 402653184,alloc_from_last_host?:false ,considerReservedCapacity?: true
> > >> > ...
> > >> > 2019-07-25 15:58:07,660 INFO  [c.c.v.VirtualMachineManagerImpl]
> > >> > (Work-Job-Executor-27:ctx-bce5a678 job-6654658/job-6654659 ctx-fe95df77)
> > >> > (logid:f0c4abda) Unable to start VM on Host[-1-Routing] due to failed to
> > >> > get free memory
> > >> > ...
> > >> > 2019-07-25 15:58:14,421 ERROR [c.c.v.VmWorkJobHandlerProxy]
> > >> > (Work-Job-Executor-27:ctx-bce5a678 job-6654658/job-6654659 ctx-fe95df77)
> > >> > (logid:f0c4abda) Invocation exception, caused by:
> > >> > com.cloud.exception.InsufficientServerCapacityException: Unable to create a
> > >> > deployment for VM[User|i-2-977270-VM]Scope=interface
> > >> > com.cloud.dc.DataCenter; id=3
> > >> >
> > >> > management-server.log https://yadi.sk/d/wzyZVeTKS2kkDQ
> > >> >
> > >> > On Thu, 25 Jul 2019 at 16:51, Andrija Panic <andrija.panic@gmail.com> wrote:
> > >> >
> > >> > > Can you share the whole log file (pastebin.org please, or similar)?
> > >> > >
> > >> > > On Wed, 24 Jul 2019 at 17:34, Dmitry Berezhnoy <d.berezhnoy@emzior.ru>
> > >> > > wrote:
> > >> > >
> > >> > > > Hello,
> > >> > > >
> > >> > > > Creating VMs asynchronously leads to "Unable to start VM on
> > >> > > > Host[-1-Routing] due to failed to get free memory". After that I see
> > >> > > > an ERROR: InsufficientServerCapacityException.
> > >> > > > In the preceding messages I see a large amount of free resources:
> > >> > > > 2019-07-24 04:50:45,443 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > > > (Work-Job-Executor-82:ctx-8d1155a3 job-2281443/job-2281455 ctx-52b48ca6)
> > >> > > > (logid:b6e33d80) Current Used CPU: 115708 , Free CPU:76356 ,Requested
> > >> > > > CPU: 512
> > >> > > > 2019-07-24 04:50:45,443 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > > > (Work-Job-Executor-82:ctx-8d1155a3 job-2281443/job-2281455 ctx-52b48ca6)
> > >> > > > (logid:b6e33d80) Current Used RAM: 122003914752 , Free RAM:145667670016
> > >> > > > ,Requested RAM: 536870912
> > >> > > > 2019-07-24 04:50:45,443 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > > > (Work-Job-Executor-82:ctx-8d1155a3 job-2281443/job-2281455 ctx-52b48ca6)
> > >> > > > (logid:b6e33d80) CPU STATS after allocation: for host: 1, old used:
> > >> > > > 115708, old reserved: 1536, actual total: 193600, total with
> > >> > > > overprovisioning: 193600; new used:116220, reserved:1536; requested
> > >> > > > cpu:512,alloc_from_last:false
> > >> > > > 2019-07-24 04:50:45,443 DEBUG [c.c.c.CapacityManagerImpl]
> > >> > > > (Work-Job-Executor-82:ctx-8d1155a3 job-2281443/job-2281455 ctx-52b48ca6)
> > >> > > > (logid:b6e33d80) RAM STATS after allocation: for host: 1, old used:
> > >> > > > 122003914752, old reserved: 1610612736, total: 269282197504; new used:
> > >> > > > 122540785664, reserved: 1610612736; requested mem:
> > >> > > > 536870912,alloc_from_last:false
> > >> > > >
> > >> > > > How is this possible?
> > >> > > >
> > >> > > > Thanks in advance for the help,
> > >> > > > Dmitry.
> > >> > > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > >
> > >> > > Andrija Panić
> > >> > >
> > >> >
> > >>
> > >>
> > >> --
> > >>
> > >> Andrija Panić
> > >>
> > >
> >
>
>
> --
>
> Andrija Panić
>
