infra-issues mailing list archives

From "Tibor Digana (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (INFRA-16951) Unstable Jenkins. Aborted build. No disk space. Lost connection between slave and master.
Date Mon, 17 Sep 2018 02:23:00 GMT

    [ https://issues.apache.org/jira/browse/INFRA-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616995#comment-16616995 ]

Tibor Digana edited comment on INFRA-16951 at 9/17/18 2:22 AM:
---------------------------------------------------------------

[~cml]
So Java tells me there is 1 CPU.
I am not talking about "now". I am only saying that the machine is overloaded when two
executors are busy.
I can clearly see this from the sleep() calls, which I had to prolong because system time
was much higher than user time in the integration tests.
I have a conditional in the code, and the sleeps and execution times in the logs correspond
to 1 CPU per Java process:
TimeUnit.MILLISECONDS.sleep( Runtime.getRuntime().availableProcessors() == 1 ? 9000 : 3750 );
When you compare Linux and Windows execution, you see that the Surefire job takes 1 hour on
Linux but 1.5 hours on Windows. Then why does my personal PC (Win7, i7 CPU, 4 cores) execute
the same build within 1 hour?
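
To make the check above easier to reproduce on a build node, here is a minimal sketch that prints what the JVM reports and which delay the conditional would pick. The class name CpuAwareDelay and the helper delayMillis are hypothetical and not taken from the Surefire code; only the two delay values and the availableProcessors() test come from the line quoted above.

```java
import java.util.concurrent.TimeUnit;

public class CpuAwareDelay {

    // Hypothetical helper: returns the delay in milliseconds for a given CPU count,
    // using the same values as the conditional quoted in the comment.
    static long delayMillis(int cpus) {
        return cpus == 1 ? 9000L : 3750L;
    }

    public static void main(String[] args) {
        // Number of processors the JVM believes it may use. On a shared or
        // restricted build node this can be lower than the physical core count.
        int cpus = Runtime.getRuntime().availableProcessors();
        long delay = delayMillis(cpus);

        System.out.println("availableProcessors = " + cpus);
        System.out.println("chosen delay        = " + delay + " ms ("
                + TimeUnit.MILLISECONDS.toSeconds(delay) + " s, rounded down)");
        // A test would then wait with: TimeUnit.MILLISECONDS.sleep(delay);
    }
}
```

Running this on the Jenkins node and on a local machine should show whether the two environments report different processor counts, which is the assumption behind the longer sleep.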


was (Author: tibor17):
[~cml]
So Java tells me there is 1 CPU.
I am not talking about "now". I am only saying that the machine is overloaded when two
executors are busy.
I can clearly see this from the sleep() calls, which I had to prolong because system time
was much higher than user time in the integration tests.
I have a conditional in the code, and the sleeps and execution times in the logs correspond
to 1 CPU per Java process:
TimeUnit.MILLISECONDS.sleep( Runtime.getRuntime().availableProcessors() == 1 ? 9000 : 3750 );

> Unstable Jenkins. Aborted build. No disk space. Lost connection between slave and master.
> -----------------------------------------------------------------------------------------
>
>                 Key: INFRA-16951
>                 URL: https://issues.apache.org/jira/browse/INFRA-16951
>             Project: Infrastructure
>          Issue Type: Bug
>          Components: Jenkins
>         Environment: Jenkins Windows/Linux slaves aborted build
>            Reporter: Tibor Digana
>            Assignee: Gavin
>            Priority: Major
>
> My build is still unstable because of h/w issues.
> Windows and Linux machines went down, and files could not be opened for lack of permission (1 GB of files).
> The worst case was the last build, where I lost the connection to the slaves for 5 minutes, and it looks like somebody or something aborted my build. The h/w issue is an additional problem:
> java.nio.file.FileSystemException: /x1/jenkins/jenkins-home/jobs/maven-box/jobs/maven-surefire/branches/INV1561/builds/24/archive/surefire-its--windows-jdk7-maven3.5.x.zip: Too many open files
> Please see the full log:
> https://builds.apache.org/job/maven-box/job/maven-surefire/job/INV1561/24/consoleFull
> If you want to know which machines were used, please search for the string "Running on" in the console log.
> Can you find out who aborted my build?
> It usually takes less than two hours to complete and uses 8 executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
