hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12594) Deadlock in metrics subsystem
Date Tue, 24 Nov 2015 17:13:10 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024875#comment-15024875 ]

Jason Lowe commented on HADOOP-12594:
-------------------------------------

The deadlock occurred because a Jetty thread was trying to handle a JMX metrics request just
as the metrics timer fired and was gathering a snapshot.
{noformat}
"324490955@qtp-119819655-4445":
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.getMetrics(MetricsSystemImpl.java:564)
        - waiting to lock <0x00000003097c02c8> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:178)
        - locked <0x000000030ab29680> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:155)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBeanInfo(DefaultMBeanServerInterceptor.java:1378)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.getMBeanInfo(JmxMBeanServer.java:920)
        at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:248)
        at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:210)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
        at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:142)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
        at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
[...]
"Timer for 'ResourceManager' metrics system":
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:194)
        - waiting to lock <0x000000030ab29680> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:410)
        - locked <0x00000003097c02c8> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
        - locked <0x00000003097c02c8> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)

Found 1 deadlock.
{noformat}

The timer thread has the MetricsSystemImpl lock and is trying to grab the MetricsSourceAdapter
lock.  In the meantime the JMX thread has the MetricsSourceAdapter lock and is trying to grab
the MetricsSystemImpl lock.  The locking order isn't consistent so we deadlocked.
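
For illustration only, here is a minimal, self-contained sketch of the same lock-order inversion. It is not the Hadoop code and not a proposed patch: the class name LockOrderSketch and the systemLock/sourceLock fields are hypothetical stand-ins for the MetricsSystemImpl and MetricsSourceAdapter monitors. It also assumes ReentrantLock.tryLock with a timeout in place of the synchronized blocks seen in the dump, so that the demo can report the cycle instead of hanging forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins for the two monitors in the thread dump.
    static final ReentrantLock systemLock = new ReentrantLock();
    static final ReentrantLock sourceLock = new ReentrantLock();

    /**
     * Runs a "timer" thread and a "JMX" thread that each need both locks and
     * returns how many of them managed to hold both at once.
     */
    static int threadsThatGotBothLocks(boolean consistentOrder) {
        // With inconsistent ordering, the latches force the bad interleaving:
        // both threads hold their first lock before either tries its second.
        int parties = consistentOrder ? 0 : 2;
        CountDownLatch holdingFirst = new CountDownLatch(parties);
        CountDownLatch triedSecond = new CountDownLatch(parties);
        AtomicInteger successes = new AtomicInteger();

        // The timer path always takes system -> source, as in the dump.
        Thread timer = new Thread(() ->
            acquire(systemLock, sourceLock, holdingFirst, triedSecond, successes));

        // The JMX path takes source -> system in the dump; the consistent
        // variant makes it follow the same system -> source order instead.
        ReentrantLock jmxFirst = consistentOrder ? systemLock : sourceLock;
        ReentrantLock jmxSecond = consistentOrder ? sourceLock : systemLock;
        Thread jmx = new Thread(() ->
            acquire(jmxFirst, jmxSecond, holdingFirst, triedSecond, successes));

        timer.start();
        jmx.start();
        try {
            timer.join();
            jmx.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return successes.get();
    }

    static void acquire(ReentrantLock first, ReentrantLock second,
                        CountDownLatch holdingFirst, CountDownLatch triedSecond,
                        AtomicInteger successes) {
        first.lock();
        try {
            holdingFirst.countDown();
            holdingFirst.await();
            // A synchronized block would wait here indefinitely; the timeout
            // lets the sketch give up and report the failure.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
                successes.incrementAndGet();
            }
            // Keep holding the first lock until both threads have tried.
            triedSecond.countDown();
            triedSecond.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("inconsistent order: "
            + threadsThatGotBothLocks(false) + " of 2 threads got both locks");
        System.out.println("consistent order:   "
            + threadsThatGotBothLocks(true) + " of 2 threads got both locks");
    }
}
```

With opposite acquisition orders neither thread in the sketch obtains its second lock, which mirrors the cycle in the dump. When both paths take the locks in the same order, one thread simply waits for the other to finish and both complete.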


> Deadlock in metrics subsystem
> -----------------------------
>
>                 Key: HADOOP-12594
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12594
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.7.1
>            Reporter: Jason Lowe
>            Priority: Critical
>
> Saw a YARN ResourceManager process encounter a deadlock which appears to be caused by
> the metrics subsystem.  Stack trace to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
