www-builds mailing list archives

From "Gavin McDonald" <ga...@16degrees.com.au>
Subject RE: [Jenkins] poor handling of offline slaves
Date Fri, 01 Jun 2012 03:46:31 GMT

> -----Original Message-----
> From: Kristian Waagan [mailto:kristian.waagan@oracle.com]
> Sent: Thursday, 31 May 2012 1:45 AM
> To: builds@apache.org
> Subject: [Jenkins] poor handling of offline slaves
> Hi,
> Currently there are several jobs that have been hanging on a Linux executor
> for several days because windows1 is offline.

I've fixed the disk space issue by:

1. Clearing out some junk from Maven and/or poorly configured jobs that don’t
clean up their workspaces.

2. Adding an 80GB disk to replace the 40GB one.

>  In addition, there are a bunch
> of jobs that have been in the queue for days.

They will catch up.

> It appears that Jenkins lets the "multi OS" jobs wait for a very long time
> before giving up on waiting for a slave. A few questions:
>   a) Is it possible to have Jenkins fail a job already occupying an executor slot if
> it has to wait for too long?

If it is occupying an executor that means the build is running and/or stuck.
If stuck, they can be configured to die after a while. With Windows builds this
does not always work.

>   b) There's only one windows slave. Are there any plans to add another
> Windows slave (preferably on a different box than windows1)?

Not currently. When running well, there is never much of a queue demand for it.
Let it catch up and we'll review the situation again in a week.

> If many projects are configured to run on multiple operating systems, of
> which two have only one slave (Windows and Solaris), these projects may
> cause jobs to pile up on Linux. Maybe there are other mechanisms in place to
> deal with this, I don't know.

Not sure what you mean; jobs run independently of each other on multiple slaves.

> There are currently two other jobs [1]  that have been hanging for two days
> or more, but there seems to be enough Linux executors to serve other jobs
> reasonably fast. For that reason I have left them alone for the time being.

I'll delete those.


> Thanks,
> --
> Kristian
> [1] https://builds.apache.org/job/Ant-Build-Matrix/ and
> https://builds.apache.org/job/Empire-db%20multios/
