cloudstack-users mailing list archives

From Andrei Mikhailovsky <and...@arhont.com>
Subject Re: deleting or cancelling broken ACS jobs
Date Tue, 10 Jun 2014 16:06:43 GMT
Thanks, I will try that. 

Andrei 


----- Original Message -----

From: "Shweta Agarwal" <Shweta.Agarwal@citrix.com> 
To: users@cloudstack.apache.org 
Sent: Tuesday, 10 June, 2014 5:04:23 AM 
Subject: RE: deleting or cancelling broken ACS jobs 


Set these global parameters to a very small value, say 1 minute:
job.cancel.threshold.minutes: time (in minutes) after which an async job is forcibly cancelled
if it has been in progress for too long
job.expire.minutes: time (in minutes) for async jobs to be kept in the system

Then restart your management server and wait a while for the stuck async jobs to expire.
After that, change these values back to their originals and restart the management server again.
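
As a rough sketch of the steps above, the two settings can be changed through the CloudStack
updateConfiguration API command, which takes a name and a value. The Python below only builds
the unsigned query strings; a real request also needs your API key and a signature, which are
left out here. The ORIGINAL values are placeholders, not known defaults: record your own current
values before lowering anything.

```python
from urllib.parse import urlencode

# The two global settings named in the message above.
SETTINGS = ("job.cancel.threshold.minutes", "job.expire.minutes")


def update_configuration_queries(values):
    """Return one unsigned updateConfiguration query string per setting."""
    return [
        urlencode({
            "command": "updateConfiguration",
            "name": name,
            "value": str(values[name]),
            "response": "json",
        })
        for name in SETTINGS
    ]


# Step 1: lower both settings to 1 minute, then restart the management server.
lowered = update_configuration_queries({name: 1 for name in SETTINGS})

# Step 2: once the stuck jobs have expired, restore the values you recorded
# beforehand and restart again. These numbers are placeholders (assumption).
ORIGINAL = {"job.cancel.threshold.minutes": 60, "job.expire.minutes": 1440}
restored = update_configuration_queries(ORIGINAL)

for query in lowered + restored:
    print(query)
```

The same change can be made by hand under Global Settings in the UI; the sketch only makes the
lower, wait, restore order explicit.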

Hope this will help 

Thanks 
Shweta 



-----Original Message----- 
From: Andrei Mikhailovsky [mailto:andrei@arhont.com] 
Sent: Monday, June 09, 2014 6:53 PM 
To: users@cloudstack.apache.org 
Subject: deleting or cancelling broken ACS jobs 

Hello guys, 

I was wondering if anyone has come across an issue where ACS gets stuck on several jobs
and keeps retrying them over and over again?

I ran into this a few days ago. For some reason, about 5 or 6 jobs on my XenServer cluster
have gone crazy. They are of different kinds, such as template creation, VM start and
enabling host maintenance.
They repeat in the logs about 20-30 times a second and flood them. I get about 20GB of
management server logs each day, and these stuck jobs seem to be the cause. I am also unable
to perform any activity on the XenServer cluster that has the stuck jobs: I cannot start or
stop jobs or do pretty much anything with it.
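
One way to confirm which jobs are looping is to count how often each job id appears in the
management server log. The sketch below assumes that log lines contain tokens of the form
job-<number>; the sample lines are made up for illustration and are not taken from a real log.

```python
import re
from collections import Counter

# Assumption: each relevant log line mentions the job as "job-<number>".
JOB_ID = re.compile(r"\bjob-(\d+)\b")


def count_job_mentions(lines):
    """Count the number of lines that mention each job id."""
    counts = Counter()
    for line in lines:
        # A set, so a job named twice on one line is counted once.
        counts.update(set(JOB_ID.findall(line)))
    return counts


# Hypothetical sample lines standing in for the real log.
sample = [
    "2014-06-09 18:00:01,100 DEBUG (Job-Executor-5:job-101) retrying",
    "2014-06-09 18:00:01,140 DEBUG (Job-Executor-5:job-101) retrying",
    "2014-06-09 18:00:01,180 DEBUG (Job-Executor-7:job-205) retrying",
    "2014-06-09 18:00:01,220 DEBUG (Job-Executor-5:job-101) retrying",
]

counts = count_job_mentions(sample)
for job_id, n in counts.most_common():
    print(job_id, n)
```

To run it against a real log, pass an open file object instead of the sample list.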

I've tried restarting both the management server and the xenserver hosts, but that didn't
help. After a short while following a restart the same thing starts to happen. 

Is there a way for ACS to cancel or remove these jobs? I've looked at the async_job and
async_job_view db tables and I can see 28 entries there, among them these stuck jobs. Is
it safe for me to simply remove them from the database and restart the management server?
Are there any other db tables that I should look at?
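
Before removing anything, it may help to list the unfinished entries read-only. The sketch
below uses an in-memory SQLite table as a stand-in for the real database, so it runs on its
own. The column names (job_cmd, job_status, created), the meaning of job_status = 0 as
"in progress", and the sample rows are all assumptions made for illustration; check them
against your own schema.

```python
import sqlite3

# Stand-in for the real database: a simplified, assumed async_job layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE async_job ("
    "id INTEGER PRIMARY KEY, job_cmd TEXT, job_status INTEGER, created TEXT)"
)
conn.executemany(
    "INSERT INTO async_job VALUES (?, ?, ?, ?)",
    [
        (101, "StartVMCmd", 0, "2014-06-05 10:00:00"),         # hypothetical row
        (150, "CreateTemplateCmd", 1, "2014-06-06 09:30:00"),  # hypothetical row
        (205, "PrepareForMaintenanceCmd", 0, "2014-06-06 11:15:00"),
    ],
)


def pending_jobs(connection):
    """Return (id, job_cmd, created) for jobs assumed to be in progress."""
    return connection.execute(
        "SELECT id, job_cmd, created FROM async_job "
        "WHERE job_status = 0 ORDER BY created"
    ).fetchall()


pending = pending_jobs(conn)
for row in pending:
    print(row)
```

This only reads the table; it does not answer whether deleting rows is safe.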

Many thanks 

Andrei 





