infra-issues mailing list archives

From "yifan zou (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (INFRA-18020) Dump beam jenkins VM resource usage
Date Fri, 15 Mar 2019 23:34:00 GMT

     [ https://issues.apache.org/jira/browse/INFRA-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yifan zou updated INFRA-18020:
------------------------------
    Description: 
The Beam Jenkins agents are running into trouble very frequently. We are now seeing that beam 4,
7 and 13 are disconnected. The errors we found in the Jenkins console are similar. Since we don't
have access to those machines, could anyone from the Infra side help us dump their resource usage,
such as memory, disk and CPU? We want to understand how these problems happened.
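
A minimal sketch of the kind of dump being requested, assuming a Linux agent with Python 3 available. The names `read_meminfo` and `collect_usage` are illustrative and do not refer to any existing Infra tool.

```python
import os
import shutil


def read_meminfo(path="/proc/meminfo"):
    """Parse /proc/meminfo into {field: kB}; returns {} if the file is unavailable."""
    info = {}
    try:
        with open(path) as fh:
            for line in fh:
                key, _, rest = line.partition(":")
                parts = rest.split()
                if parts and parts[0].isdigit():
                    info[key.strip()] = int(parts[0])
    except OSError:
        pass
    return info


def collect_usage(disk_path="/"):
    """Collect one snapshot of memory, disk and CPU load for this machine."""
    mem = read_meminfo()
    disk = shutil.disk_usage(disk_path)
    try:
        load_avg = os.getloadavg()  # 1, 5 and 15 minute load averages
    except (OSError, AttributeError):
        load_avg = (None, None, None)
    return {
        "mem_total_kb": mem.get("MemTotal"),
        "mem_available_kb": mem.get("MemAvailable"),
        "swap_free_kb": mem.get("SwapFree"),
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
        "load_avg": tuple(load_avg),
        "cpu_count": os.cpu_count(),
    }


if __name__ == "__main__":
    for name, value in collect_usage().items():
        print(f"{name}: {value}")
```

Passing the Jenkins home directory that appears in the logs (`/home/jenkins`) as `disk_path` would report on the filesystem that raised "No space left on device", assuming the script runs on the agent itself.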

*Beam4 beam_PreCommit_Java_Commit #4638*
07:09:35 # There is insufficient memory for the Java Runtime Environment to continue.
07:09:35 # Cannot create GC thread. Out of system resources.
07:09:35 # Possible reasons:
07:09:35 #   The system is out of physical RAM or swap space
07:09:35 #   In 32 bit mode, the process size limit was hit
07:09:35 # Possible solutions:
07:09:35 #   Reduce memory load on the system
07:09:35 #   Increase physical memory or swap space
07:09:35 #   Check if swap backing store is full
07:09:35 #   Use 64 bit Java on a 64 bit OS
07:09:35 #   Decrease Java heap size (-Xmx/-Xms)
07:09:35 #   Decrease number of Java threads
07:09:35 #   Decrease Java thread stack sizes (-Xss)
07:09:35 #   Set larger code cache with -XX:ReservedCodeCacheSize=
07:09:35 # This output file may be truncated or incomplete.
07:09:35 #
07:09:35 #  Out of Memory Error (gcTaskThread.cpp:48), pid=19780, tid=0x00007f39b241b700
07:09:35 #
07:09:35 # JRE version:  (8.0_191-b12) (build )
07:09:35 # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.191-b12 mixed mode linux-amd64 compressed oops)
07:09:35 # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
07:09:35 #



*Beam4 beam_PreCommit_Python_PVR_Flink_Commit #1065*
Installing collected packages: crcmod, dill, fastavro, docopt, certifi, chardet, idna, urllib3,
requests, hdfs, httplib2, pbr, funcsigs, mock, pyasn1, pyasn1-modules, rsa, oauth2client,
pyparsing, pydot, pytz, pyyaml, avro, pyvcf, typing, numpy, pyarrow, nose, python-dateutil,
pandas, parameterized, pyhamcrest, monotonic, tenacity, apache-beam
07:10:00  Running setup.py develop for apache-beam
07:10:00    Error [Errno 28] No space left on device while executing command /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_PVR_Flink_Commit/src/build/gradleenv/1327086738/bin/python2 -c "import setuptools, tokenize;__file__='/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_PVR_Flink_Commit/src/sdks/python/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps
07:10:00 Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
07:10:00

*Beam7 beam_PreCommit_Java_Commit #4783*
Error occurred during initialization of VM
13:41:33 java.lang.OutOfMemoryError: unable to create new native thread
13:41:33 #
13:41:33 # There is insufficient memory for the Java Runtime Environment to continue.
13:41:33 # Cannot create GC thread. Out of system resources.
13:41:33 # An error report file with more information is saved as:
13:41:33 # /home/jenkins/.gradle/workers/hs_err_pid22438.log
13:41:33 Could not write standard input to Gradle Worker Daemon 17.
13:41:33 java.io.IOException: Broken pipe
13:41:33 	at java.io.FileOutputStream.writeBytes(Native Method)
13:41:33 	at java.io.FileOutputStream.write(FileOutputStream.java:326)
13:41:33 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
13:41:33 	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
13:41:33 	at org.gradle.process.internal.streams.ExecOutputHandleRunner.forwardContent(ExecOutputHandleRunner.java:67)
13:41:33 	at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:52)
13:41:33 	at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
13:41:33 	at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
13:41:33 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
13:41:33 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
13:41:33 	at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
13:41:33 	at java.lang.Thread.run(Thread.java:748)



*Beam13 beam_PreCommit_Java_Commit #4756*
11:22:50 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
11:22:50 	at java.lang.Thread.start0(Native Method)
11:22:50 	at java.lang.Thread.start(Thread.java:717)
11:22:50 	at org.gradle.util.DisconnectableInputStream$ThreadExecuter.execute(DisconnectableInputStream.java:50)
11:22:50 	at org.gradle.util.DisconnectableInputStream$ThreadExecuter.execute(DisconnectableInputStream.java:45)
11:22:50 	at org.gradle.util.DisconnectableInputStream.<init>(DisconnectableInputStream.java:130)
11:22:50 	at org.gradle.util.DisconnectableInputStream.<init>(DisconnectableInputStream.java:59)
11:22:50 	at org.gradle.util.DisconnectableInputStream.<init>(DisconnectableInputStream.java:55)
11:22:50 	at org.gradle.process.internal.streams.ForwardStdinStreamsHandler.connectStreams(ForwardStdinStreamsHandler.java:51)
11:22:50 	at org.gradle.process.internal.DefaultExecHandle$CompositeStreamsHandler.connectStreams(DefaultExecHandle.java:417)
11:22:50 	at org.gradle.process.internal.ExecHandleRunner.startProcess(ExecHandleRunner.java:98)
11:22:50 	at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:70)
11:22:50 	... 7 more
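
"unable to create new native thread" can be triggered by process or thread limits as well as by exhausted RAM, so a snapshot of those limits may be worth including in the dump. The following is a hedged sketch for a Linux agent; the helper names are hypothetical, and the logs above do not show whether limits were the cause here.

```python
import os
import resource


def read_int(path):
    """Return the integer stored in a /proc file, or None if unreadable."""
    try:
        with open(path) as fh:
            return int(fh.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_threads(proc_root="/proc"):
    """Sum the 'Threads:' field over every process visible under proc_root."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return None
    total = 0
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                for line in fh:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except (OSError, ValueError, IndexError):
            continue  # process exited or is not readable
    return total


def thread_limits():
    """Report per-user and system-wide limits next to current thread usage."""
    # RLIMIT_NPROC is per user; threads_in_use below counts all users.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "nproc_soft": soft,  # -1 means unlimited
        "nproc_hard": hard,
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
        "threads_in_use": count_threads(),
    }


if __name__ == "__main__":
    for name, value in thread_limits().items():
        print(f"{name}: {value}")
```

If `threads_in_use` is close to `threads_max` or `pid_max`, or the jenkins user is near `nproc_soft`, the limit rather than memory would be the first thing to check.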





> Dump beam jenkins VM resource usage
> -----------------------------------
>
>                 Key: INFRA-18020
>                 URL: https://issues.apache.org/jira/browse/INFRA-18020
>             Project: Infrastructure
>          Issue Type: Bug
>          Components: Jenkins
>            Reporter: yifan zou
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
