cloudstack-issues mailing list archives

From "Anthony Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0
Date Fri, 14 Nov 2014 22:58:34 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213016#comment-14213016 ]

Anthony Xu commented on CLOUDSTACK-7857:
----------------------------------------

_xs_memory_used is used as the memory virtualization overhead for this XS host, but the actual
overhead varies a lot depending on total host free memory, VM density, VM memory size,
VM guest OS type, and so on.

To me, there seems to be no way to know the precise memory virtualization overhead before you
actually use the host to run VMs.
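
For reference, here is a minimal standalone Java sketch (an illustration only, not the actual
CitrixResourceBase code) that reproduces the fillHostInfo calculation with the numbers from this
report and compares it against the roughly 9GB of Dom0+Xen usage the reporter observed:

    // Standalone illustration; all constants are taken from the issue description,
    // not from a live host.
    public class MemoryOverheadExample {
        public static void main(String[] args) {
            long ram = 274841497600L;                // total host memory (~256 GB)
            long dom0Ram = 4269801472L;              // memory assigned to dom0 (~4 GB)
            long xsMemoryUsed = 128 * 1024 * 1024L;  // fixed 128 MB overhead constant
            double xsVirtualizationFactor = 63.0 / 64.0;

            // Same formula as fillHostInfo:
            long acsAvailable = (long) ((ram - dom0Ram - xsMemoryUsed) * xsVirtualizationFactor);

            // The reporter observed Dom0 + Xen actually consuming about 9 GB on this host.
            long observedOverhead = 9L * 1024 * 1024 * 1024;
            long actualAvailable = ram - observedOverhead;

            System.out.println("ACS reports available: " + acsAvailable);    // 266211892800
            System.out.println("Actually available   : " + actualAvailable); // 265177821184
            System.out.println("Over-reported by     : " + (acsAvailable - actualAvailable)); // ~0.96 GB
        }
    }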


CloudStack provides two ways to mitigate this:

1. Retry mechanism.
   CloudStack retries in many places, such as deployVM, startVM, and migrateVM.
2. Threshold.
   cluster.memory.allocated.capacity.disablethreshold
   You can use this per-cluster configuration to limit how much of the reported memory CloudStack
   is allowed to allocate (see the rough example below).
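
As a rough illustration (assuming the threshold is applied as a simple fraction of the memory a
host reports): with cluster.memory.allocated.capacity.disablethreshold = 0.95 and a host that
reports 266211892800 bytes (~247.9 GB), CloudStack stops allocating new memory above
0.95 * 266211892800 = 252901298160 bytes (~235.5 GB), leaving roughly 12.4 GB of headroom per
host, which comfortably covers the ~1 GB that fillHostInfo over-reports in the scenario described
in this issue.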

If you have other thoughts on this, please share them with us.

Anthony










> CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0
> -----------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7857
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7857
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>    Affects Versions: Future, 4.3.0, 4.4.0, 4.5.0, 4.3.1, 4.4.1, 4.6.0
>            Reporter: Joris van Lieshout
>            Priority: Blocker
>
> We have hosts with 256GB memory and 4GB dom0. During startup ACS calculates available memory using this formula:
> CitrixResourceBase.java
> 	protected void fillHostInfo
> 		ram = (long) ((ram - dom0Ram - _xs_memory_used) * _xs_virtualization_factor);
> In our situation:
> 	ram = 274841497600
> 	dom0Ram = 4269801472
> 	_xs_memory_used = 128 * 1024 * 1024L = 134217728
> 	_xs_virtualization_factor = 63.0/64.0 = 0.984375
> 	(274841497600 - 4269801472 - 134217728) * 0.984375 = 266211892800
> This is in fact not the actual amount of memory available for instances. The difference in our situation is a little less than 1GB. On this particular hypervisor Dom0+Xen uses about 9GB.
> As the comment above the definition of XsMemoryUsed already stated, it's time to review this logic:
> "//Hypervisor specific params with generic value, may need to be overridden for specific versions"
> The effect of this bug is that when you put a hypervisor in maintenance it might try to move instances (usually small instances (<1GB)) to a host that in fact does not have enough free memory.
> This exception is thrown:
> ERROR [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-09aca6e9 work-8981) Terminating HAWork[8981-Migration-4482-Running-Migrating]
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate due to Catch Exception com.cloud.utils.exception.CloudRuntimeException: Migration failed due to com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM(r-4482-VM) from host(6805d06c-4d5b-4438-a245-7915e93041d9) due to Task failed! Task record:                 uuid: 645b63c8-1426-b412-7b6a-13d61ee7ab2e
>            nameLabel: Async.VM.pool_migrate
>      nameDescription: 
>    allowedOperations: []
>    currentOperations: {}
>              created: Thu Nov 06 13:44:14 CET 2014
>             finished: Thu Nov 06 13:44:14 CET 2014
>               status: failure
>           residentOn: com.xensource.xenapi.Host@b42882c6
>             progress: 1.0
>                 type: <none/>
>               result: 
>            errorInfo: [HOST_NOT_ENOUGH_FREE_MEMORY, 272629760, 263131136]
>          otherConfig: {}
>            subtaskOf: com.xensource.xenapi.Task@aaf13f6f
>             subtasks: []
>         at com.cloud.vm.VirtualMachineManagerImpl.migrate(VirtualMachineManagerImpl.java:1840)
>         at com.cloud.vm.VirtualMachineManagerImpl.migrateAway(VirtualMachineManagerImpl.java:2214)
>         at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.runWithContext(HighAvailabilityManagerImpl.java:865)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.access$000(HighAvailabilityManagerImpl.java:822)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread$1.run(HighAvailabilityManagerImpl.java:834)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
