www-builds mailing list archives

From Dennis Lundberg <denn...@apache.org>
Subject Re: Windows slaves (1 and 2) offline
Date Mon, 14 Apr 2014 19:37:39 GMT
I have just killed the following jobs on windows1; they had been stuck
for 23+ hours:
1. https://builds.apache.org/job/Qpid-Java-Java-BDB-TestMatrix/
2. https://builds.apache.org/job/river-qa-refactor-win6/
3. https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008_java/

Together they were effectively blocking all other projects that needed
a windows slave.

The problem with 1 is that it is triggered by
https://builds.apache.org/job/Qpid-Java-Java-MMS-TestMatrix
which in turn is on a periodic schedule (once a day, 0 9 * * *) as
well as an SCM poll schedule (once every 15 minutes, */15 * * * *).

The same problem goes for 3, which is on a periodic schedule (once a
day, 30 8 * * *).

In my opinion we should not allow periodic schedules.
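
For reference, the three cron expressions quoted above can be checked
with a small sketch. This is only an illustration in Python, not how
Jenkins evaluates its triggers: the helper names (field_matches,
cron_matches, runs_per_day) are made up here, and the matcher handles
only '*', '*/N' and plain integers, which is all these expressions use.

```python
from datetime import datetime, timedelta


def field_matches(field: str, value: int) -> bool:
    """Match one cron field. Supports only '*', '*/N' and a plain integer."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)


def cron_matches(expr: str, when: datetime) -> bool:
    """True if the five-field expression (min hour dom month dow) matches."""
    minute, hour, dom, month, dow = expr.split()
    # cron counts Sunday as 0, Python's weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        and field_matches(dow, cron_dow)
    )


def runs_per_day(expr: str, day: datetime) -> int:
    """Count the minutes of the given day on which the expression matches."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return sum(
        cron_matches(expr, start + timedelta(minutes=m))
        for m in range(24 * 60)
    )


if __name__ == "__main__":
    day = datetime(2014, 4, 14)
    for expr in ("0 9 * * *", "*/15 * * * *", "30 8 * * *"):
        print(expr, "->", runs_per_day(expr, day), "match(es) per day")
```

Under these assumptions the two daily schedules match once per day and
the SCM poll schedule matches 96 times per day. A poll only checks for
changes, so it does not by itself start 96 builds.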

On Sun, Apr 13, 2014 at 10:46 AM, Gavin McDonald <gavin@16degrees.com.au> wrote:
> Managed to kill 3 of them, looking into why.
>
> Gav…
>
> On 13/04/2014, at 7:01 AM, Erik de Bruin <erik@ixsoftware.nl> wrote:
>
>> Currently there are 4 builds stuck on the windows1 slave. They seem to have
>> stopped on the SCM step right at the beginning of their builds.
>>
>> Can you please take a look?
>>
>> EdB
>>
>>
>>
>>
>> On Fri, Apr 11, 2014 at 4:43 PM, Alex Harui <aharui@adobe.com> wrote:
>>
>>> Hi Jake,
>>>
>>> Thanks for restarting.  I can't help but wonder if there is still some
>>> configuration issue with Jenkins and Git that is causing Windows1 to run
>>> out of memory.  Is there an investigation going on in that regard?
>>>
>>> Thanks,
>>> -Alex
>>>
>>> On 4/11/14 7:38 AM, "Jake Farrell" <jfarrell@apache.org> wrote:
>>>
>>>> Hey Erik
>>>> Windows1 ran out of memory, restarted and builds in the queue have been
>>>> picked up and are running
>>>>
>>>> -Jake
>>>>
>>>>
>>>> On Fri, Apr 11, 2014 at 10:17 AM, Erik de Bruin <erik@ixsoftware.nl>
>>>> wrote:
>>>>
>>>>> Same week, second time... The 'windows1' slave is offline. There are
>>>>> builds that have been in the queue for over 12 hours, so it's not
>>>>> 'idling'.
>>>>>
>>>>> Can someone look at this, please?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> EdB
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 8, 2014 at 1:08 AM, David Nalley <david@gnsa.us> wrote:
>>>>>
>>>>>> Jan and I discussed this briefly at ApacheCon and are tossing around
>>>>>> the idea of having Circonus monitor the status of the slave (according
>>>>>> to Jenkins) and perhaps to take corrective action automagically. We're
>>>>>> going to continue to think and work on this. Neither of us has admin
>>>>>> privs on the Windows slaves, so we'd want folks that do (and are thus
>>>>>> responsible for maintaining them) to bless this approach.
>>>>>>
>>>>>> --David
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 7, 2014 at 11:17 AM, Alex Harui <aharui@adobe.com> wrote:
>>>>>>> Hi Jake,
>>>>>>>
>>>>>>> Is there some way you could create a "button" that we could hit to restart
>>>>>>> the Windows slave so we don't have to keep bothering you?  Or does it
>>>>>>> require human intervention to get it to come back up?
>>>>>>>
>>>>>>> Maybe some script we can get at from people.a.o, or a custom Jenkins task
>>>>>>> that we kick, or a button on the wiki that runs some script code?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Alex
>>>>>>>
>>>>>>> On 4/7/14 8:13 AM, "Erik de Bruin" <erik@ixsoftware.nl> wrote:
>>>>>>>
>>>>>>>> Good news.
>>>>>>>>
>>>>>>>> Excellent service, thank you!
>>>>>>>>
>>>>>>>> EdB
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Apr 7, 2014 at 4:22 PM, Jake Farrell <jfarrell@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hey Erik
>>>>>>>>> I just restarted windows 1 and it has picked up the Apache Flex build
>>>>>>>>> and is running it right now.
>>>>>>>>>
>>>>>>>>> -Jake
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Apr 7, 2014 at 10:08 AM, Erik de Bruin <erik@ixsoftware.nl> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This is becoming a weekly event... both 'windows' slaves are offline,
>>>>>>>>>> again.
>>>>>>>>>>
>>>>>>>>>> You might want to seriously consider accepting the offers to help from
>>>>>>>>>> the friendly people in the "volunteering for ASF Jenkins farm service
>>>>>>>>>> maintenance" thread.
>>>>>>>>>>
>>>>>>>>>> EdB
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 3, 2014 at 7:22 PM, Jake Farrell <jfarrell@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> restarted, builds should start getting picked up shortly
>>>>>>>>>>>
>>>>>>>>>>> -Jake
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 3, 2014 at 1:05 PM, Erik de Bruin <erik@ixsoftware.nl> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Both Windows slaves seem to be offline. There are several 'windows' builds
>>>>>>>>>>>> in the queue, so it seems they are not simply idling. Can you please take a
>>>>>>>>>>>> look?
>>>>>>>>>>>>
>>>>>>>>>>>> EdB
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 1, 2014 at 9:20 AM, Jake Farrell <jfarrell@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Justin
>>>>>>>>>>>>> The builds look like they are working, not sure why java is giving you
>>>>>>>>>>>>> that error for the latest java path since
>>>>>>>>>>>>> /f/hudson/tools/java/latest-1.6-64/jre/bin/java.exe -version gives me a
>>>>>>>>>>>>> print out of 1.6.0_27. If you wouldn't mind creating a ticket for this so
>>>>>>>>>>>>> someone can investigate it I would appreciate it, it's 3am for me and I
>>>>>>>>>>>>> need to call it a night
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Jake
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 1, 2014 at 3:09 AM, Justin Mclean <justin@classsoftware.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Flex-sdk_1 and flex-sdk_release fixed and started, looking through the
>>>>>>>>>>>>>>> other flex builds now
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_1/60/
>>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_release/539/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While it looks like they are compiling I noticed this:
>>>>>>>>>>>>>> java.io.IOException: Cannot run program
>>>>>>>>>>>>>> "f:\hudson\tools\java\latest-1.6-64\jre\bin\java.exe
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So it looks like the version of java it expects to use is missing??
>>>>>>>>>>>>>> Justin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Ix Multimedia Software
>>>>>>>>>>>>
>>>>>>>>>>>> Jan Luykenstraat 27
>>>>>>>>>>>> 3521 VB Utrecht
>>>>>>>>>>>>
>>>>>>>>>>>> T. 06-51952295
>>>>>>>>>>>> I. www.ixsoftware.nl
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>>
>



-- 
Dennis Lundberg
