flink-user mailing list archives

From Emmanuel <ele...@msn.com>
Subject GC on taskmanagers
Date Tue, 31 Mar 2015 03:44:06 GMT
My Java is still rusty and I often run into OutOfMemoryError: GC overhead limit exceeded...
Yes, I need to look for memory leaks...
But first I need to clear up this memory so I can run again without having to shut down and
restart everything.
I've tried using the jcmd <pid> GC.run command on each of the JVM instances on a taskmanager,
but I get a boatload of output like this:
On the host running the command:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
    at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
    at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
    at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:213)
    at sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:140)
    at sun.tools.jcmd.JCmd.main(JCmd.java:129)


and on the taskmanager log:
"Flink-IPC Server handler 1 on 6121" daemon prio=10 tid=0x00007f5f107ee000 nid=0x8f waiting on condition [0x00007f5eb4803000]
  java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f37e95c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.flink.runtime.ipc.Server$Handler.run(Server.java:941)
"Flink-IPC Server handler 0 on 6121" daemon prio=10 tid=0x00007f5f107eb800 nid=0x8e waiting on condition [0x00007f5eb4904000]
  java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f37e95c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.flink.runtime.ipc.Server$Handler.run(Server.java:941)
"Flink-IPC Server listener on 6121" daemon prio=10 tid=0x00007f5f107e9800 nid=0x8d runnable [0x00007f5eb4a05000]
  java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000f385d3c0> (a sun.nio.ch.Util$2)
    - locked <0x00000000f385d3d0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f385d378> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
    at org.apache.flink.runtime.ipc.Server$Listener.run(Server.java:341)
"Flink-IPC Server Responder" daemon prio=10 tid=0x00007f5f107e8800 nid=0x8c runnable [0x00007f5eb4b06000]
  java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000f387b528> (a sun.nio.ch.Util$2)
    - locked <0x00000000f387b538> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f387b4e0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at org.apache.flink.runtime.ipc.Server$Responder.run(Server.java:506)
"Service Thread" daemon prio=10 tid=0x00007f5f100c2000 nid=0x8a runnable [0x0000000000000000]
  java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007f5f100c0000 nid=0x89 waiting on condition [0x0000000000000000]
  java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007f5f100bd000 nid=0x88 waiting on condition [0x0000000000000000]
  java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007f5f100b3000 nid=0x87 waiting on condition [0x0000000000000000]
  java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007f5f1009c800 nid=0x86 in Object.wait() [0x00007f5eb605b000]
  java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000f381cc08> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x00000000f381cc08> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)
"Reference Handler" daemon prio=10 tid=0x00007f5f10098800 nid=0x85 in Object.wait() [0x00007f5eb615c000]
  java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000f381c820> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x00000000f381c820> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007f5f1000d800 nid=0x6a in Object.wait() [0x00007f5f178d4000]
  java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000fbe14200> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.java:1115)
    - locked <0x00000000fbe14200> (a java.lang.Object)
"VM Thread" prio=10 tid=0x00007f5f10096000 nid=0x84 runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f5f10023000 nid=0x6b runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f5f10025000 nid=0x6c runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f5f10027000 nid=0x6d runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f5f10029000 nid=0x6e runnable
"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007f5f1002a800 nid=0x6f runnable
"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007f5f1002c800 nid=0x70 runnable
"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007f5f1002e800 nid=0x71 runnable
"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007f5f10030000 nid=0x72 runnable
"GC task thread#8 (ParallelGC)" prio=10 tid=0x00007f5f10032000 nid=0x73 runnable
"GC task thread#9 (ParallelGC)" prio=10 tid=0x00007f5f10034000 nid=0x74 runnable
"GC task thread#10 (ParallelGC)" prio=10 tid=0x00007f5f10036000 nid=0x75 runnable
"GC task thread#11 (ParallelGC)" prio=10 tid=0x00007f5f10037800 nid=0x76 runnable
"GC task thread#12 (ParallelGC)" prio=10 tid=0x00007f5f10039800 nid=0x77 runnable
"GC task thread#13 (ParallelGC)" prio=10 tid=0x00007f5f1003b800 nid=0x78 runnable
"GC task thread#14 (ParallelGC)" prio=10 tid=0x00007f5f1003d000 nid=0x79 runnable
"GC task thread#15 (ParallelGC)" prio=10 tid=0x00007f5f1003f000 nid=0x7a runnable
"GC task thread#16 (ParallelGC)" prio=10 tid=0x00007f5f10041000 nid=0x7b runnable
"GC task thread#17 (ParallelGC)" prio=10 tid=0x00007f5f10043000 nid=0x7c runnable
"GC task thread#18 (ParallelGC)" prio=10 tid=0x00007f5f10044800 nid=0x7d runnable
"GC task thread#19 (ParallelGC)" prio=10 tid=0x00007f5f10046800 nid=0x7e runnable
"GC task thread#20 (ParallelGC)" prio=10 tid=0x00007f5f10048800 nid=0x7f runnable
"GC task thread#21 (ParallelGC)" prio=10 tid=0x00007f5f1004a000 nid=0x80 runnable
"GC task thread#22 (ParallelGC)" prio=10 tid=0x00007f5f1004c000 nid=0x81 runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f5f100d5000 nid=0x8b waiting on condition
JNI global references: 530
Heap
 PSYoungGen      total 76800K, used 63133K [0x00000000faa80000, 0x0000000100000000, 0x0000000100000000)
  eden space 66048K, 95% used [0x00000000faa80000,0x00000000fe827690,0x00000000feb00000)
  from space 10752K, 0% used [0x00000000ff580000,0x00000000ff580000,0x0000000100000000)
  to   space 10752K, 0% used [0x00000000feb00000,0x00000000feb00000,0x00000000ff580000)
 ParOldGen       total 175104K, used 175046K [0x00000000eff80000, 0x00000000faa80000, 0x00000000faa80000)
  object space 175104K, 99% used [0x00000000eff80000,0x00000000faa71bb0,0x00000000faa80000)
 PSPermGen       total 29696K, used 29267K [0x00000000dff80000, 0x00000000e1c80000, 0x00000000eff80000)
  object space 29696K, 98% used [0x00000000dff80000,0x00000000e1c14d38,0x00000000e1c80000)




Any insight on cleanly forcing a GC when this happens?
Thanks!
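
For reference, here is a minimal sketch of checking heap usage and requesting a collection from inside a JVM with the standard java.lang.management API. It is not a Flink facility: the class name HeapCheck and its methods are made up for illustration, and it assumes you can run code inside the taskmanager JVM, which jcmd does not need. The old-generation numbers are copied from the heap summary in the message. A requested GC only frees objects that are no longer reachable, so with ParOldGen nearly full of live data it may recover very little.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, for illustration only; not part of Flink.
class HeapCheck {

    /** Percentage of capacity in use; both arguments must be in the same unit. */
    static double usedPercent(long used, long capacity) {
        return used * 100.0 / capacity;
    }

    /** Requests a collection and reports heap usage before and after the request. */
    static String requestGcAndReport() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage before = memory.getHeapMemoryUsage();
        // Effectively the same as System.gc(); the JVM may treat it as a hint.
        memory.gc();
        MemoryUsage after = memory.getHeapMemoryUsage();
        return String.format("heap used: %d KB -> %d KB (committed %d KB)",
                before.getUsed() / 1024, after.getUsed() / 1024, after.getCommitted() / 1024);
    }

    public static void main(String[] args) {
        // Old-generation figures (in KB) taken from the ParOldGen line of the heap summary.
        System.out.printf("ParOldGen: %.2f%% used%n", usedPercent(175046, 175104));
        System.out.println(requestGcAndReport());
    }
}
```

If the "used" figure barely moves after the request, the memory is still referenced and the job needs either a larger heap or a leak fix, not a forced collection.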
