hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl
Date Thu, 20 Aug 2015 22:34:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705903#comment-14705903 ]

stack edited comment on HBASE-14274 at 8/20/15 10:34 PM:
---------------------------------------------------------

Is this related to your change? Should we protect against it?

{code}
2015-08-20 15:31:10,704 WARN  [HBase-Metrics2-1] impl.MetricsConfig(124): Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
2015-08-20 15:31:10,710 ERROR [HBase-Metrics2-1] lib.MethodMetric$2(118): Error invoking method getBlocksTotal
java.lang.reflect.InvocationTargetException
	at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.metrics2.lib.MethodMetric$2.snapshot(MethodMetric.java:111)
	at org.apache.hadoop.metrics2.lib.MethodMetric.snapshot(MethodMetric.java:144)
	at org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java:387)
	at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:79)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
	at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:221)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:96)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:245)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:229)
	at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:290)
	at com.sun.proxy.$Proxy13.postStart(Unknown Source)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:185)
	at org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:81)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:198)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getTotalBlocks(BlockManager.java:3158)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlocksTotal(FSNamesystem.java:5652)
	... 32 more
{code}

In particular, note the NPE near the end.
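
As a rough illustration of what "protect against it" could mean, here is a minimal sketch of a size getter that tolerates a backing map that has already been torn down. It assumes the NPE comes from the metrics thread reading a structure that was nulled out (for example during shutdown); the trace alone does not confirm that. BlockCounterSketch and its methods are hypothetical names, not HDFS or HBase APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a component whose size is exported as a metric.
class BlockCounterSketch {
    // volatile so a metrics thread sees the null written by close().
    private volatile Map<String, Long> blocks = new HashMap<>();

    // Only the owning thread mutates the map in this sketch.
    void add(String id, long length) {
        Map<String, Long> local = blocks;
        if (local != null) {
            local.put(id, length);
        }
    }

    // Simulates shutdown dropping the backing structure.
    void close() {
        blocks = null;
    }

    // Metric getter: read the field once and fall back to 0 instead of throwing.
    int size() {
        Map<String, Long> local = blocks;
        return local == null ? 0 : local.size();
    }
}
```

With this shape, a late metrics snapshot after close() reports 0 rather than surfacing an InvocationTargetException in the log.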


> Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14274
>                 URL: https://issues.apache.org/jira/browse/HBASE-14274
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>            Reporter: stack
>         Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch
>
>
> While looking into the parent issue, I got a local hang of TestDistributedLogReplay.
> The region close threads are parked here:
> {code}
> "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x00007ff65c03f800 nid=0x54347 waiting on condition [0x000000011f7ac000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x000000075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
> 	at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
> 	at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
> 	at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
> 	- locked <0x00000007ff878190> (a java.lang.Object)
> 	at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> {code}
> They are trying to call MetricsRegionAggregateSourceImpl.deregister. They want the write lock on that class's local ReentrantReadWriteLock while holding the write lock of MetricsRegionSourceImpl's readWriteLock.
> Meanwhile, the JmxCacheBuster is running, trying to get metrics while taking the same two locks in the reverse order:
> {code}
> "HBase-Metrics2-1" daemon prio=5 tid=0x00007ff65e14b000 nid=0x59a03 waiting on condition [0x0000000140ea5000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00000007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
> 	at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
> 	at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.getMetrics(MetricsRegionAggregateSourceImpl.java:115)
> 	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
> 	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
> 	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
> 	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
> 	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
> 	at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
> 	at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
> 	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:221)
> 	- locked <0x00000007e654bdc0> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
> 	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:96)
> 	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:245)
> 	- locked <0x0000000754302660> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
> 	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:229)
> 	at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:290)
> 	at com.sun.proxy.$Proxy13.postStart(Unknown Source)
> 	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:185)
> 	- locked <0x0000000754302660> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
> 	at org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:81)
> {code}
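
The two dumps above describe a classic lock-order inversion. The following is a minimal sketch of one common remedy, acquiring both locks in the same order on every path. It assumes the metrics path takes the aggregate lock before the region lock, as the description suggests. LockOrderSketch, aggregateLock, regionLock, closeRegion, and snapshot are hypothetical stand-ins, not the real HBase classes, and this is not necessarily what the attached patches do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-ins for the aggregate source and one region source.
class LockOrderSketch {
    static final ReentrantReadWriteLock aggregateLock = new ReentrantReadWriteLock();
    static final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    // Close path, reordered: aggregate write lock first, then region write lock.
    static void closeRegion() {
        aggregateLock.writeLock().lock();
        try {
            regionLock.writeLock().lock();
            try {
                // deregister from the aggregate and mark the region closed
            } finally {
                regionLock.writeLock().unlock();
            }
        } finally {
            aggregateLock.writeLock().unlock();
        }
    }

    // Metrics path: aggregate read lock first, then region read lock.
    static void snapshot() {
        aggregateLock.readLock().lock();
        try {
            regionLock.readLock().lock();
            try {
                // read the region's metrics
            } finally {
                regionLock.readLock().unlock();
            }
        } finally {
            aggregateLock.readLock().unlock();
        }
    }

    // Runs both paths concurrently; returns true if both finish within the timeout.
    static boolean runBoth(int iterations) {
        CountDownLatch done = new CountDownLatch(2);
        Thread closer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) closeRegion();
            done.countDown();
        });
        Thread metrics = new Thread(() -> {
            for (int i = 0; i < iterations; i++) snapshot();
            done.countDown();
        });
        closer.setDaemon(true);
        metrics.setDaemon(true);
        closer.start();
        metrics.start();
        try {
            return done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because both paths take aggregateLock before regionLock, neither thread can hold one lock while waiting for the other in the opposite order. An alternative with the same effect would be for the close path to release the region lock before calling into the aggregate.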



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
