cloudstack-dev mailing list archives

From anil lakineni <anilkumar459.lakin...@gmail.com>
Subject Re: Fault percentage value of CPU usage in Cloud Platform
Date Thu, 24 Nov 2016 09:14:39 GMT
Dear Will,

Good afternoon, I hope everything is fine at your end.

Please find my answers to your questions below.

- do you have VMs allocated, but turned off? They will count towards the
provisioned CPU even though they are not running because they could be
started at any time and are expecting to have the resources to start.

*Yes, I have a few VMs in the shutdown state. But, AFAIK, CPU is not
counted for VPSs that are shut down, because whenever I turn a VPS
off or on, the allocated CPU value and its percentage change
accordingly.*


- do you have more than one cluster? The dashboard only shows the most used
cluster, but if you drill down it shows the whole environment's resources,
so if you have more than one cluster, that could explain the difference.
*Yes, I have two clusters. As I mentioned in my previous e-mails, I can
see the true allocated value on the dashboard (i.e., 800 GHz/2000 GHz),
and the same value in the zone-level Resources tab as well. But when it
comes to the percentage, the dashboard shows a wrong value (91%), whereas
the Resources tab shows 40%, which is correct, since 800/2000 is 40%.*
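To make the arithmetic concrete, here is a minimal Python sketch. The zone-wide totals (800 GHz of 2000 GHz) come from this thread; the per-cluster split is purely hypothetical, chosen only to show how a zone-wide figure of 40% could coexist with a much higher figure for a single cluster, which is the possibility raised in the question above. The function and cluster names are made up for illustration.

```python
def allocated_percent(allocated_ghz: float, total_ghz: float) -> float:
    """Return allocated capacity as a percentage of total capacity."""
    if total_ghz <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * allocated_ghz / total_ghz


# Zone-wide totals reported in this thread.
zone_pct = allocated_percent(800, 2000)

# HYPOTHETICAL per-cluster split (not from the thread): (allocated, total) in GHz.
# The two clusters still add up to the zone-wide 800 GHz of 2000 GHz.
clusters = {
    "cluster-A": (345, 1500),
    "cluster-B": (455, 500),
}
cluster_pct = {name: allocated_percent(a, t) for name, (a, t) in clusters.items()}

# The cluster with the highest allocated percentage.
most_used = max(cluster_pct, key=cluster_pct.get)

print(f"zone: {zone_pct:.0f}%")
for name, pct in cluster_pct.items():
    print(f"{name}: {pct:.0f}%")
print(f"most used: {most_used}")
```

Under these assumed numbers the zone as a whole is at 40% while one cluster is at 91%, so both figures could in principle be consistent with the same 800/2000 GHz totals.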
*The issue is with the allocated CPU percentage on the dashboard. Why is
it showing the wrong value? It is causing our deployments to fail, since
the cloud platform checks the allocated CPU percentage shown on the
dashboard, not the one in the Resources tab.*
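
The failure described here can be pictured as a simple threshold comparison. The sketch below is only a simplified model of the behaviour suggested by the log message quoted further down ("allocated percentage crosses the disable capacity threshold"); it is not the platform's actual code. The function name and both threshold values are assumptions; the thread only says the threshold was raised to "above 90%".

```python
def cluster_skipped(allocated_fraction: float, disable_threshold: float) -> bool:
    """Simplified model: a cluster is skipped once its allocated
    fraction exceeds the disable threshold."""
    return allocated_fraction > disable_threshold


dashboard_fraction = 0.91    # percentage shown on the dashboard (from the thread)
zone_fraction = 800 / 2000   # zone-level figure (from the thread), i.e. 0.40

ASSUMED_THRESHOLD = 0.85     # assumption: some threshold below 91%
RAISED_THRESHOLD = 0.95      # assumption: a value "above 90%"

for label, fraction in [("dashboard", dashboard_fraction), ("zone", zone_fraction)]:
    for threshold in (ASSUMED_THRESHOLD, RAISED_THRESHOLD):
        verdict = "skipped" if cluster_skipped(fraction, threshold) else "usable"
        print(f"{label} {fraction:.2f} vs threshold {threshold:.2f}: {verdict}")
```

In this model, a 91% figure blocks deployment under the lower threshold and passes once the threshold is raised, whereas a 40% figure would pass either way.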

- are you trying to deploy to a specific cluster with a service offering
tag? SvcOffering:WinL? Is that the most used cluster?
*Yes, I am trying to deploy to the second cluster (WinL tag). The two
clusters are used in almost the same ratio.*


Is it a bug? My cloud version is 4.5.
Do I need to restart any services on the management server to get the
actual percentage value on the dashboard?
Do I need to modify the database directly?

*Please let me know if you need more information to help resolve this
issue. Thanks.*

Best Regards,
Anil.

On Tue, Nov 22, 2016 at 3:22 PM, Will Stevens <williamstevens@gmail.com>
wrote:

> A couple things.
> - do you have VMs allocated, but turned off? They will count towards the
> provisioned CPU even though they are not running because they could be
> started at any time and are expecting to have the resources to start.
> - do you have more than one cluster? The dashboard only shows the most used
> cluster, but if you drill down it shows the whole environment's resources,
> so if you have more than one cluster, that could explain the difference.
> - are you trying to deploy to a specific cluster with a service offering
> tag? SvcOffering:WinL? Is that the most used cluster?
>
> Let us know.
>
> On Nov 22, 2016 6:51 AM, "anil lakineni" <anilkumar459.lakineni@gmail.com>
> wrote:
>
> > Hi Sudharma,
> >
> > I checked the management server logs from when the VPS failed to deploy
> > and found that the allocated CPU value exceeded the threshold, so the
> > VPS deployment failed.
> > I then raised the CPU disable & alert threshold values to above 90%
> > and was able to deploy the VPS.
> >
> > Please check http://pastebin.com/irrS0TTg
> > for the management server log from the failed VM deployment.
> >
> > *A brief excerpt of the log:*
> >
> > 2016-11-17 12:46:34,100 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner@5a32f393
> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Trying to allocate a host and storage pools from dc:1, pod:null,cluster:null, requested cpu: 38400, requested ram: 68719476736
> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Is ROOT volume READY (pool already allocated)?: No
> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Searching all possible resources under this Zone: 1
> > 2016-11-17 12:46:34,104 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing pods in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
> > 2016-11-17 12:46:34,111 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the podId list these pods from avoid set: []
> > 2016-11-17 12:46:34,115 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) SeqA 27-149419: Processing Seq 27-149419:  { Cmd , MgmtId: -1, via: 27, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":519,"_loadInfo":"{\n  \"connections\": []\n}","wait":0}}] }
> > 2016-11-17 12:46:34,124 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources under Pod: 1
> > 2016-11-17 12:46:34,125 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) SeqA 27-149419: Sending Seq 27-149419:  { Ans: , MgmtId: 47019105324719, via: 27, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
> > 2016-11-17 12:46:34,126 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
> > 2016-11-17 12:46:34,133 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the clusterId list these clusters from avoid set: []
> > 2016-11-17 12:46:34,141 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) *Cannot allocate cluster list [5] for vm creation since their allocated percentage crosses the disable capacity threshold defined at each cluster/ at global value for capacity Type : 1, skipping these clusters*
> > 2016-11-17 12:46:34,156 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources in Cluster: 1 under Pod: 1
> > 2016-11-17 12:46:34,156 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for hosts in dc: 1  pod:1  cluster:1
> > 2016-11-17 12:46:34,156 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for hosts having tag specified on SvcOffering:WinL
> > 2016-11-17 12:46:34,159 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Hosts with tag 'WinL' are:[]
> > 2016-11-17 12:46:34,163 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) FirstFitAllocator has 0 hosts to check for allocation: []
> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Found 0 hosts for allocation after prioritization: []
> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for speed=38400Mhz, Ram=65536
> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Host Allocator returning 0 suitable hosts
> > 2016-11-17 12:46:34,170 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No suitable hosts found
> > 2016-11-17 12:46:34,170 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No suitable hosts found under this Cluster: 1
> > 2016-11-17 12:46:34,174 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Could not find suitable Deployment Destination for this VM under any clusters, returning.
> > 2016-11-17 12:46:34,174 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Searching all possible resources under this Zone: 1
> > 2016-11-17 12:46:34,177 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing pods in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
> > 2016-11-17 12:46:34,184 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the podId list these pods from avoid set: []
> > 2016-11-17 12:46:34,188 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources under Pod: 1
> > 2016-11-17 12:46:34,189 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
> > 2016-11-17 12:46:34,196 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the clusterId list these clusters from avoid set: [1]
> > 2016-11-17 12:46:34,205 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) *Cannot allocate cluster list [5] for vm creation since their allocated percentage crosses the disable capacity threshold defined at each cluster/ at global value for capacity Type : 1, skipping these clusters*
> > 2016-11-17 12:46:34,205 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No clusters found after removing disabled clusters and clusters in avoid list, returning.
> > 2016-11-17 12:46:34,212 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Destroying vm VM[User|i-91-736-VM] as it failed to create on Host with Id:null
> > 2016-11-17 12:46:34,252 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) VM state transitted from :Stopped to Error with event: OperationFailedToErrorvm's original host id: null new host id: null host id before state transition: null
> >
> >
> > Please let me know if you require more information; I will provide it.
> >
> > Best Regards,
> > Anil.
> >
> > On Tue, Nov 22, 2016 at 12:10 PM, Sudharma Jain <sudharma.sid@gmail.com>
> > wrote:
> >
> > > Hi Anil,
> > >
> > > There could be a bug with the dashboard, but it has nothing to do with
> > > the deployment failure. Check your management server logs.
> > >
> > > Thanks,
> > > Sudharma
> > >
> > > On Tue, Nov 22, 2016 at 1:25 PM, anil lakineni <
> > > anilkumar459.lakineni@gmail.com> wrote:
> > >
> > > > Good Morning,
> > > >
> > > > @Will- but we don't have a support contract.
> > > >
> > > > @Bharat- True, but the allocated CPU percentage is shown incorrectly on
> > > > the Dashboard, whereas in the Zone's Resources *(path: 'Infrastructure'
> > > > -> 'Zones' -> click on the desired zone name -> 'Resources')* the
> > > > percentage is shown correctly.
> > > >
> > > > Total CPU allocated is 800 GHz out of 2000 GHz, so the percentage
> > > > should be around 40%, but in my case the Dashboard shows 91%, which
> > > > causes new deployments to fail. The same value in the Zone's Resources
> > > > is shown accurately as 40%.
> > > >
> > > >
> > > > For new VPS or VM deployments, however, the cloud uses the Dashboard
> > > > percentage, not the one in the Zone's Resources. Could you help me
> > > > fix this issue?
> > > >
> > > >
> > > > Best Regards,
> > > > Anil.
> > > >
> > > > On Mon, Nov 21, 2016 at 7:20 PM, Bharat Kumar <
> > > bharat.kumar@accelerite.com
> > > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > There may be a difference between what you have allocated and what is
> > > > > actually being used. The dashboard shows what is allocated.
> > > > >
> > > > > Regards,
> > > > > Bharat.
> > > > >
> > > > > On 11/21/16, 9:44 PM, "williamstevens@gmail.com on behalf of Will
> > > > > Stevens" <williamstevens@gmail.com on behalf of
> > wstevens@cloudops.com>
> > > > > wrote:
> > > > >
> > > > > >You will have to contact Accelerite for support with ACP (previously CCP).
> > > > > >We have no visibility into the ACP code or how to support you.
> > > > > >
> > > > > >https://support.accelerite.com/hc/en-us
> > > > > >
> > > > > >Best of luck...
> > > > > >
> > > > > >*Will STEVENS*
> > > > > >Lead Developer
> > > > > >
> > > > > ><https://goo.gl/NYZ8KK>
> > > > > >
> > > > > >On Mon, Nov 21, 2016 at 3:44 AM, anil lakineni <
> > > > > >anilkumar459.lakineni@gmail.com> wrote:
> > > > > >
> > > > > >> Dear All,
> > > > > >>
> > > > > > >> On the CloudPlatform dashboard, our CPU usage is shown with a wrong
> > > > > > >> (high, 91%) value, which in turn prevents us from provisioning new
> > > > > > >> VMs. In fact, only 40% of the available CPU is utilized. Even on the
> > > > > > >> dashboard, only the percentage is wrong; the CPU usage value itself
> > > > > > >> is accurate (800/2000 GHz).
> > > > > > >>
> > > > > > >> In addition, when we check the CPU status at the zone level, we see
> > > > > > >> the accurate CPU usage percentage in all zones. Only the dashboard
> > > > > > >> shows a false usage percentage (which causes new deployments to
> > > > > > >> fail).
> > > > > > >>
> > > > > > >> - Our CCP version is 4.5.0
> > > > > > >> - Hypervisors are Xen 6.2 & 6.5
> > > > > > >>
> > > > > > >> Please help me sort out this issue, and let me know if any
> > > > > > >> additional information is needed.
> > > > > >>
> > > > > >>
> > > > > >> Best Regards,
> > > > > >> Anil.
> > > > > >>
