hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang
Date Fri, 20 Jan 2017 18:28:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832211#comment-15832211 ]

Kihwal Lee commented on HDFS-11192:
-----------------------------------

Looking at the Java source, {{Thread.start0()}} is mapped to a JVM native method, {{JVM_StartThread()}}.
A thread's uncaught exception handling is wired into the Thread class so that the handler is
called when the thread exits. So setting the handler on the worker threads won't do any good.

But the exception does bubble up and is handled by the ThreadGroup's handler. In this case,
it's the namenode's main thread. So we could set the UncaughtExceptionHandler in the caller
thread so that it terminates on OOM. It might be better to set it before a quota init and set it
back to the default afterwards, to avoid other subsystems (e.g. jetty) causing an exit.
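
A minimal sketch of the "set it before a quota init and set it back after" idea follows. It is not taken from any patch on this issue. The class name {{QuotaInitOomGuard}}, the method {{runWithOomHandler}} and the {{onOom}} callback are hypothetical, and the sketch assumes the process-wide default handler ({{Thread.setDefaultUncaughtExceptionHandler}}) is the one being swapped, which is where an exception ends up once it has bubbled through the ThreadGroup chain without being handled.

```java
final class QuotaInitOomGuard {

    /**
     * Runs quotaInit with a temporary default uncaught exception handler
     * that invokes onOom when any thread dies with an OutOfMemoryError.
     * The previous default handler is restored afterwards.
     * (Hypothetical helper, for illustration only.)
     */
    static void runWithOomHandler(Runnable quotaInit, Runnable onOom) {
        // Remember whatever was installed before (may be null).
        Thread.UncaughtExceptionHandler previous =
            Thread.getDefaultUncaughtExceptionHandler();

        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (error instanceof OutOfMemoryError) {
                // In a real daemon this callback would terminate the process.
                onOom.run();
            } else if (previous != null) {
                // Anything else keeps its earlier behavior.
                previous.uncaughtException(thread, error);
            } else {
                System.err.println("Exception in thread \"" + thread.getName() + "\"");
                error.printStackTrace();
            }
        });

        try {
            quotaInit.run();
        } finally {
            // Restore, so that other subsystems are not affected later on.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
    }
}
```

A caller would pass the quota initialization as {{quotaInit}} and something that ends the process as {{onOom}}. Note that the handler is process-wide while it is installed, so an OOM in an unrelated thread during that window would trigger it as well.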

{{ForkJoinPool}} is also used by the JDK itself in JDK 8 (e.g. {{ConcurrentHashMap}}), so even
with this fix, you can run into strange problems if the process cannot create more threads.

> OOM during Quota Initialization lead to Namenode hang
> -----------------------------------------------------
>
>                 Key: HDFS-11192
>                 URL: https://issues.apache.org/jira/browse/HDFS-11192
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot be created,
> it does not notify the parent. The parent keeps waiting for the notify call, and that wait
> is not a timed wait either.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
>         at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
>         at java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
>         at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
>         at java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

