cloudstack-dev mailing list archives

From anil lakineni <anilkumar459.lakin...@gmail.com>
Subject Re: Fault percentage value of CPU usage in Cloud Platform
Date Sun, 27 Nov 2016 06:18:11 GMT
Hello Guys,

Can someone suggest a solution to my issue?

Looking for help.

Best Regards,
Anil.


On Thu, Nov 24, 2016 at 12:14 PM, anil lakineni <
anilkumar459.lakineni@gmail.com> wrote:

> Dear Will,
>
> Good afternoon, I hope everything is fine at your end.
>
> Please find my comments for your questions,
>
> - do you have VMs allocated, but turned off? They will count towards the
> provisioned CPU even though they are not running because they could be
> started at any time and are expecting to have the resources to start.
>
> *Yes, I have a few VMs that are shut down. But AFAIK, CPU is not counted
> for VPSs in the shutdown state, because when I turn any VPS off or on, the
> allocated CPU value and its percentage change accordingly.*
>
>
> - do you have more than one cluster? The dashboard only shows the most used
> cluster, but if you drill down it shows the whole environment's resources,
> so if you have more than one cluster, that could explain the difference.
> *Yes, I have two clusters. As I mentioned in my previous e-mails, I can see
> the true allocated value on the dashboard (i.e., 800 GHz/2000 GHz) and the
> same value in the whole resources view (zone level). But the percentage
> shown on the dashboard is wrong (91%), whereas the whole resources tab
> shows 40%, which is correct, since 800/2000 is 40%.*
> *The issue is the allocated CPU percentage on the dashboard. Why is it
> wrong? It is causing our deployments to fail (since the cloud platform
> checks the allocated CPU percentage shown on the dashboard, not the one in
> the whole resources tab).*
>
> - are you trying to deploy to a specific cluster with a service offering
> tag? SvcOffering:WinL? Is that the most used cluster?
> *Yes, I'm trying to deploy to the second cluster (WinL tag). The two
> clusters are used in almost the same ratio.*
>
>
> Is it a bug? My cloud version is 4.5.
> Do I need to restart any services on the management server to get the
> actual percentage value on the dashboard?
> Do I need to change the database directly?
>
> *Please let me know if you need more information to help resolve this
> issue. Thanks.*
>
> Best Regards,
> Anil.
>
> On Tue, Nov 22, 2016 at 3:22 PM, Will Stevens <williamstevens@gmail.com>
> wrote:
>
>> A couple things.
>> - do you have VMs allocated, but turned off? They will count towards the
>> provisioned CPU even though they are not running because they could be
>> started at any time and are expecting to have the resources to start.
>> - do you have more than one cluster? The dashboard only shows the most used
>> cluster, but if you drill down it shows the whole environment's resources,
>> so if you have more than one cluster, that could explain the difference.
>> - are you trying to deploy to a specific cluster with a service offering
>> tag? SvcOffering:WinL? Is that the most used cluster?
>>
>> Let us know.
>>
>> On Nov 22, 2016 6:51 AM, "anil lakineni" <anilkumar459.lakineni@gmail.com> wrote:
>>
>> > Hi Sudharma,
>> >
>> > I checked the management server logs from when the VPS failed to deploy
>> > and found that the CPU value exceeded the threshold, so the VPS
>> > deployment failed.
>> > I then raised the CPU disable and alert thresholds to above 90% and was
>> > able to deploy the VPS.
>> >
>> > Please check *http://pastebin.com/irrS0TTg* for the management server
>> > log from the failed VM deployment.
>> >
>> > *A brief excerpt of the log:*
>> >
>> > 2016-11-17 12:46:34,100 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner@5a32f393
>> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Trying to allocate a host and storage pools from dc:1, pod:null,cluster:null, requested cpu: 38400, requested ram: 68719476736
>> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Is ROOT volume READY (pool already allocated)?: No
>> > 2016-11-17 12:46:34,101 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Searching all possible resources under this Zone: 1
>> > 2016-11-17 12:46:34,104 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing pods in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
>> > 2016-11-17 12:46:34,111 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the podId list these pods from avoid set: []
>> > 2016-11-17 12:46:34,115 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) SeqA 27-149419: Processing Seq 27-149419:  { Cmd , MgmtId: -1, via: 27, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":519,"_loadInfo":"{\n  \"connections\": []\n}","wait":0}}] }
>> > 2016-11-17 12:46:34,124 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources under Pod: 1
>> > 2016-11-17 12:46:34,125 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) SeqA 27-149419: Sending Seq 27-149419:  { Ans: , MgmtId: 47019105324719, via: 27, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
>> > 2016-11-17 12:46:34,126 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
>> > 2016-11-17 12:46:34,133 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the clusterId list these clusters from avoid set: []
>> > 2016-11-17 12:46:34,141 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) *Cannot allocate cluster list [5] for vm creation since their allocated percentage crosses the disable capacity threshold defined at each cluster/ at global value for capacity Type : 1, skipping these clusters*
>> > 2016-11-17 12:46:34,156 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources in Cluster: 1 under Pod: 1
>> > 2016-11-17 12:46:34,156 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for hosts in dc: 1  pod:1  cluster:1
>> > 2016-11-17 12:46:34,156 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for hosts having tag specified on SvcOffering:WinL
>> > 2016-11-17 12:46:34,159 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Hosts with tag 'WinL' are:[]
>> > 2016-11-17 12:46:34,163 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) FirstFitAllocator has 0 hosts to check for allocation: []
>> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Found 0 hosts for allocation after prioritization: []
>> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Looking for speed=38400Mhz, Ram=65536
>> > 2016-11-17 12:46:34,170 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf FirstFitRoutingAllocator) (logid:393001e5) Host Allocator returning 0 suitable hosts
>> > 2016-11-17 12:46:34,170 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No suitable hosts found
>> > 2016-11-17 12:46:34,170 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No suitable hosts found under this Cluster: 1
>> > 2016-11-17 12:46:34,174 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Could not find suitable Deployment Destination for this VM under any clusters, returning.
>> > 2016-11-17 12:46:34,174 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Searching all possible resources under this Zone: 1
>> > 2016-11-17 12:46:34,177 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing pods in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
>> > 2016-11-17 12:46:34,184 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the podId list these pods from avoid set: []
>> > 2016-11-17 12:46:34,188 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Checking resources under Pod: 1
>> > 2016-11-17 12:46:34,189 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
>> > 2016-11-17 12:46:34,196 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Removing from the clusterId list these clusters from avoid set: [1]
>> > 2016-11-17 12:46:34,205 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) *Cannot allocate cluster list [5] for vm creation since their allocated percentage crosses the disable capacity threshold defined at each cluster/ at global value for capacity Type : 1, skipping these clusters*
>> > 2016-11-17 12:46:34,205 DEBUG [c.c.d.FirstFitPlanner] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) No clusters found after removing disabled clusters and clusters in avoid list, returning.
>> > 2016-11-17 12:46:34,212 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) Destroying vm VM[User|i-91-736-VM] as it failed to create on Host with Id:null
>> > 2016-11-17 12:46:34,252 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf) (logid:393001e5) VM state transitted from :Stopped to Error with event: OperationFailedToErrorvm's original host id: null new host id: null host id before state transition: null
>> >
>> >
>> > Please let me know if you require more information; I will provide it.
>> >
>> > Best Regards,
>> > Anil.
>> >
>> > On Tue, Nov 22, 2016 at 12:10 PM, Sudharma Jain <sudharma.sid@gmail.com> wrote:
>> >
>> > > Hi Anil,
>> > >
>> > > There could be a bug with the dashboard, but it has nothing to do with
>> > > the deployment failure. Check your management server logs.
>> > >
>> > > Thanks,
>> > > Sudharma
>> > >
>> > > On Tue, Nov 22, 2016 at 1:25 PM, anil lakineni <
>> > > anilkumar459.lakineni@gmail.com> wrote:
>> > >
>> > > > Good Morning,
>> > > >
>> > > > @Will- but we don't have a support contract.
>> > > >
>> > > > @Bharat- True, but the allocated CPU percentage is shown incorrectly
>> > > > on the Dashboard, whereas in the Zone's Resources *(path:
>> > > > 'Infrastructure' -> 'Zones' -> 'click on desired zone name' ->
>> > > > 'Resources')* the percentage is shown correctly.
>> > > >
>> > > > Total CPU allocated is 800 GHz out of 2000 GHz, so the percentage
>> > > > should be about 40%, but in my case the Dashboard shows 91%, which
>> > > > causes new deployments to fail. The same value in the Zone's
>> > > > Resources shows an accurate 40%.
>> > > >
>> > > >
>> > > > For new VPS or VM deployments, the cloud uses the dashboard
>> > > > percentage, not the one in the Zone's Resources. Could you help me
>> > > > fix this issue?
>> > > >
>> > > >
>> > > > Best Regards,
>> > > > Anil.
>> > > >
>> > > > On Mon, Nov 21, 2016 at 7:20 PM, Bharat Kumar <bharat.kumar@accelerite.com> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > There may be a difference in what you have allocated and what is
>> > > > > being actually used. The dashboard shows what is allocated.
>> > > > >
>> > > > > Regards,
>> > > > > Bharat.
>> > > > >
>> > > > > On 11/21/16, 9:44 PM, "williamstevens@gmail.com on behalf of Will Stevens" <williamstevens@gmail.com on behalf of wstevens@cloudops.com> wrote:
>> > > > >
>> > > > > >You will have to contact Accelerite for support with ACP
>> > > > > >(previously CCP). We have no visibility into the ACP code or how
>> > > > > >to support you.
>> > > > > >
>> > > > > >https://support.accelerite.com/hc/en-us
>> > > > > >
>> > > > > >Best of luck...
>> > > > > >
>> > > > > >*Will STEVENS*
>> > > > > >Lead Developer
>> > > > > >
>> > > > > ><https://goo.gl/NYZ8KK>
>> > > > > >
>> > > > > >On Mon, Nov 21, 2016 at 3:44 AM, anil lakineni <
>> > > > > >anilkumar459.lakineni@gmail.com> wrote:
>> > > > > >
>> > > > > >> Dear All,
>> > > > > >>
>> > > > > >> On the CloudPlatform dashboard, our CPU usage shows a wrong
>> > > > > >> (high, 91%) value, which in turn prevents us from provisioning
>> > > > > >> new VMs. In fact, only 40% of the available CPU is utilized.
>> > > > > >> Even on the Dashboard, only the percentage calculation is
>> > > > > >> wrong; the CPU usage value itself is accurate (800/2000 GHz).
>> > > > > >>
>> > > > > >> In addition, when we check the CPU status at the zone level, we
>> > > > > >> see the accurate CPU usage percentage in all zones. Only the
>> > > > > >> dashboard shows a false usage percentage (which causes new
>> > > > > >> deployments to fail).
>> > > > > >>
>> > > > > >> - Our CCP version is 4.5.0
>> > > > > >> - Hypervisors are Xen 6.2 & 6.5
>> > > > > >>
>> > > > > >> Please help me sort out this issue, and let me know if any
>> > > > > >> additional information is needed.
>> > > > > >>
>> > > > > >>
>> > > > > >> Best Regards,
>> > > > > >> Anil.
>> > > > > >>
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
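
Will's second point in the thread (the dashboard reflects the most used cluster, while the zone view aggregates everything) can be illustrated with a small sketch. It is only an illustration of that hypothesis, not a diagnosis. The per-cluster figures and the 85% disable threshold below are hypothetical, chosen so that they add up to the 800/2000 GHz and the 91% and 40% readings reported in the thread; the function and variable names are also made up for this example.

```python
# Hypothetical per-cluster CPU figures in GHz. Only the zone totals
# (800 of 2000 GHz) and the 91% / 40% readings come from the thread.
clusters = {
    1: {"allocated_ghz": 345.0, "total_ghz": 1500.0},
    5: {"allocated_ghz": 455.0, "total_ghz": 500.0},
}

# Assumed disable threshold, for illustration only.
DISABLE_THRESHOLD_PCT = 85.0


def allocated_pct(allocated_ghz, total_ghz):
    """Allocated capacity as a percentage of total capacity."""
    return 100.0 * allocated_ghz / total_ghz


# Zone-wide view: sum across clusters first, then divide.
zone_allocated = sum(c["allocated_ghz"] for c in clusters.values())
zone_total = sum(c["total_ghz"] for c in clusters.values())
zone_pct = allocated_pct(zone_allocated, zone_total)

# Per-cluster view: divide within each cluster.
per_cluster_pct = {
    cid: allocated_pct(c["allocated_ghz"], c["total_ghz"])
    for cid, c in clusters.items()
}
most_used = max(per_cluster_pct, key=per_cluster_pct.get)

# Clusters that a per-cluster threshold check would skip.
skipped = sorted(
    cid for cid, pct in per_cluster_pct.items() if pct > DISABLE_THRESHOLD_PCT
)

print(f"zone: {zone_allocated:.0f}/{zone_total:.0f} GHz = {zone_pct:.0f}%")
print(f"most used cluster: {most_used} at {per_cluster_pct[most_used]:.0f}%")
print(f"skipped clusters: {skipped}")
```

With these made-up numbers the zone as a whole sits at 40% while cluster 5 alone sits at 91%, so a check applied per cluster would skip cluster 5 even though the zone has spare capacity.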

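To see which clusters the planner skipped, the "Cannot allocate cluster list" entries in the management server log can be pulled out with a short script. This is a rough sketch that assumes the message wording matches the excerpt quoted in the thread; `skipped_clusters` and `SKIP_RE` are names invented for this example, and reading capacity Type 1 as CPU is an assumption based on the context of the thread.

```python
import re

# Matches the planner message quoted in the thread. Whitespace is normalised
# first, so e-mail line wrapping inside the message does not matter.
SKIP_RE = re.compile(
    r"Cannot allocate cluster list \[([\d,\s]*)\] for vm creation"
    r".*?capacity Type : (\d+)"
)


def skipped_clusters(log_text):
    """Return sorted, de-duplicated (cluster_id, capacity_type) pairs."""
    flat = " ".join(log_text.split())
    found = set()
    for ids, capacity_type in SKIP_RE.findall(flat):
        for cluster_id in ids.split(","):
            if cluster_id.strip():
                found.add((int(cluster_id), int(capacity_type)))
    return sorted(found)


# One wrapped entry copied from the log excerpt in the thread.
sample = """
2016-11-17 12:46:34,141 DEBUG [c.c.d.FirstFitPlanner]
(API-Job-Executor-22:ctx-f48bbb10 job-98412 ctx-daf38dbf)
(logid:393001e5) Cannot
allocate cluster list [5] for vm creation since their allocated percentage
crosses the disable capacity threshold defined at each cluster/ at global
value for capacity Type : 1, skipping these clusters
"""

print(skipped_clusters(sample))  # [(5, 1)]
```

In the quoted log this points at cluster 5, while the only cluster left to try (cluster 1) reports no hosts with the 'WinL' tag, which is consistent with the deployment failing.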