hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses
Date Tue, 04 Jun 2013 18:37:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674684#comment-13674684 ]

Todd Lipcon commented on HADOOP-9618:

BTW, the info reported from the beans seems to be off due to an OpenJDK bug. When I run the
same test program with Oracle JDK 1.6.0_14 I get correct stats from the CMS MXBean:

13/06/04 11:36:33 INFO util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3232ms
GC pool 'ParNew' had collection(s): count=1 time=56ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=3665ms
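
For readers unfamiliar with where the per-pool numbers come from, here is a minimal sketch of how such statistics could be gathered with the standard java.lang.management API: snapshot each GarbageCollectorMXBean's cumulative count and time before and after an interval, then report the difference. The class name GcSnapshot and its methods are hypothetical and are not taken from the attached patch; only the output format is copied from the log lines above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not the code from hadoop-9618.txt.
class GcSnapshot {
    // Cumulative {collectionCount, collectionTimeMs} per GC pool name.
    static Map<String, long[]> take() {
        Map<String, long[]> snap = new TreeMap<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            snap.put(bean.getName(),
                     new long[] {bean.getCollectionCount(), bean.getCollectionTime()});
        }
        return snap;
    }

    // One line per pool that ran at least one collection between the snapshots.
    static String diff(Map<String, long[]> before, Map<String, long[]> after) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, long[]> e : after.entrySet()) {
            long[] b = before.get(e.getKey());
            long[] a = e.getValue();
            long count = a[0] - (b == null ? 0 : b[0]);
            long timeMs = a[1] - (b == null ? 0 : b[1]);
            if (count > 0) {
                sb.append("GC pool '").append(e.getKey())
                  .append("' had collection(s): count=").append(count)
                  .append(" time=").append(timeMs).append("ms\n");
            }
        }
        return sb.toString();
    }
}
```

A caller would invoke GcSnapshot.take() before and after the monitored sleep and print GcSnapshot.diff(before, after) only when a pause was detected. The reported times are only as accurate as the values the JVM's beans provide.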

> Add thread which detects JVM pauses
> -----------------------------------
>                 Key: HADOOP-9618
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9618
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: util
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-9618.txt
> Oftentimes users struggle to understand what happened when a long JVM pause (GC or otherwise)
> causes things to malfunction inside a Hadoop daemon. For example, a long GC pause while logging
> an edit to the QJM may cause the edit to time out, or a long GC pause may cause other IPCs to
> the NameNode to time out. We should add a simple thread which loops on 1-second sleeps, and if
> the sleep ever takes significantly longer than 1 second, log a WARN. This will make GC pauses
> obvious in logs.
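
The proposal in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: the class name PauseMonitorSketch, the 1000 ms warning threshold, and the use of System.err in place of a real logger are all hypothetical choices. The only parts taken from the issue are the 1-second sleep loop and the idea of warning when the sleep overruns significantly.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the code from hadoop-9618.txt.
class PauseMonitorSketch implements Runnable {
    static final long SLEEP_MS = 1000;          // sleep interval from the issue text
    static final long WARN_THRESHOLD_MS = 1000; // assumed meaning of "significantly longer"

    // Milliseconds by which the observed sleep exceeded the requested one.
    static long extraSleepMs(long startNanos, long endNanos, long requestedMs) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos) - requestedMs;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long start = System.nanoTime(); // monotonic, unaffected by wall-clock changes
            try {
                Thread.sleep(SLEEP_MS);
            } catch (InterruptedException e) {
                return; // treat interruption as a request to stop
            }
            long extra = extraSleepMs(start, System.nanoTime(), SLEEP_MS);
            if (extra > WARN_THRESHOLD_MS) {
                // A real daemon would log at WARN level through its logging framework.
                System.err.println("WARN Detected pause in JVM or host machine (eg GC): "
                        + "pause of approximately " + extra + "ms");
            }
        }
    }

    // Starts the monitor as a daemon thread so it never blocks JVM shutdown.
    static Thread start() {
        Thread t = new Thread(new PauseMonitorSketch(), "pause-monitor-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Because the monitor thread is itself stopped by a stop-the-world pause, an overrun of its sleep is a reasonable proxy for the pause length, though it cannot distinguish GC from other causes such as host-level stalls.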

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
