www-builds mailing list archives

From Nicolas Lalevée <nicolas.lale...@hibnet.org>
Subject Re: [Jenkins] poor handling of offline slaves
Date Fri, 01 Jun 2012 12:54:10 GMT

On 1 June 2012 at 10:45, Kristian Waagan wrote:

> On 01.06.12 10:35, Nicolas Lalevée wrote:
>> On 1 June 2012 at 10:03, Kristian Waagan wrote:
>>> On 01.06.12 05:46, Gavin McDonald wrote:
>>>>>>  If many projects are configured to run on multiple operating systems,
>>>>>>  two of which have only one slave (Windows and Solaris), these projects
>>>>>>  cause jobs to pile up on Linux. Maybe there are other mechanisms in place
>>>>>>  to deal with this, I don't know.
>>>> Not sure what you mean; jobs run independently of each other on multiple slaves.
>>> From what I could see, jobs configured to run on multiple slaves using the
>>> "Configuration Matrix" plugin/feature will hang on to the current slave while
>>> waiting for the next one. For instance, commons-vfs-trunk had been running for
>>> five days and was occupying one executor on ubuntuX while waiting for windows1
>>> to become available. The timeout was set to 188 minutes, so waiting for the
>>> next slave doesn't seem to count as being stuck.
>>> The two other jobs I mentioned are also using the Configuration Matrix feature.
>>> Of course, this will only be a problem if the system is overloaded, or if a
>>> slave, or group of slaves, is offline for a longer period of time and these
>>> jobs eat up the executor slots on the healthy slaves.
>> A "Matrix" job does not actually consume an executor. It only triggers jobs
>> and monitors them. Notice how Jenkins displays them while they are running:
>> they are not in the first two boxes of a slave (the executor slots), they are
>> in an extra one.
> Ah, I see.
> Thanks for that explanation, Nicolas.
> That only leaves why the job doesn't time out, but maybe that's as designed too?

I don't know.
I think they should time out too, so the job maintainers get notified.

