hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17374) ZKPermissionWatcher crashed when grant after close region
Date Tue, 27 Dec 2016 14:51:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780543#comment-15780543
] 

Ted Yu commented on HBASE-17374:
--------------------------------

{code}
790	      if (ref-1 <= 0 && shouldClose) {
{code}
When ref-1 < 0, care should be taken. The next time release() is called, we would get into
the if block which triggers the abort.
If there is no release call in the future, we have memory leak.

> ZKPermissionWatcher crashed when grant after close region 
> ----------------------------------------------------------
>
>                 Key: HBASE-17374
>                 URL: https://issues.apache.org/jira/browse/HBASE-17374
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.15
>            Reporter: Liu Junhong
>            Priority: Critical
>         Attachments: 0001-fix-for-HBASE-17374.patch
>
>
> It was occurred many time that  I granted some permission,  but few of some regionservers
was not token effect and must be restart . When I look up logs,  I found that :
> 2016-12-08 15:00:26,238 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] handler.CloseRegionHandler
(CloseRegionHandler.java:process(128)) - Processing close of data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.
> {color:red} 2016-12-08 15:00:26,242 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0]
regionserver.HRegion (HRegion.java:doClose(1163)) - Closing data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.:
disabling compactions & flushes {color}
> 2016-12-08 15:00:26,242 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HRegion
(HRegion.java:doClose(1190)) - Updates disabled for region data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.
> 2016-12-08 15:00:26,242 INFO  [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HRegion
(HRegion.java:internalFlushcache(1753)) - Started memstore flush for data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.,
current region memstore size 160
> 2016-12-08 15:00:26,284 INFO  [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.DefaultStoreFlusher
(DefaultStoreFlusher.java:flushSnapshot(95)) - Flushed, sequenceid=6, memsize=160, hasBloomFilter=true,
into tmp file hdfs://dx-data-hbase-watcher/hbase/data/default/data-probe-test/5f06cb6447343b602e05996bfd87ce14/.tmp/8d734ce3d93e40628d8f82111e754cb3
> 2016-12-08 15:00:26,303 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HRegionFileSystem
(HRegionFileSystem.java:commitStoreFile(370)) - Committing store file hdfs://dx-data-hbase-watcher/hbase/data/default/data-probe-test/5f06cb6447343b602e05996bfd87ce14/.tmp/8d734ce3d93e40628d8f82111e754cb3
as hdfs://dx-data-hbase-watcher/hbase/data/default/data-probe-test/5f06cb6447343b602e05996bfd87ce14/cf2/8d734ce3d93e40628d8f82111e754cb3
> 2016-12-08 15:00:26,318 INFO  [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HStore
(HStore.java:commitFile(877)) - Added hdfs://dx-data-hbase-watcher/hbase/data/default/data-probe-test/5f06cb6447343b602e05996bfd87ce14/cf2/8d734ce3d93e40628d8f82111e754cb3,
entries=1, sequenceid=6, filesize=985
> 2016-12-08 15:00:26,319 INFO  [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HRegion
(HRegion.java:internalFlushcache(1920)) - Finished memstore flush of ~160/160, currentsize=0/0
for region data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14. in 77ms, sequenceid=6,
compaction requested=false
> 2016-12-08 15:00:26,323 INFO  [StoreCloserThread-data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.-1]
regionserver.HStore (HStore.java:close(774)) - Closed cf1
> 2016-12-08 15:00:26,325 INFO  [StoreCloserThread-data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.-1]
regionserver.HStore (HStore.java:close(774)) - Closed cf2
> 2016-12-08 15:00:26,326 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] coprocessor.CoprocessorHost
(CoprocessorHost.java:shutdown(292)) - Stop coprocessor org.apache.hadoop.hbase.security.token.TokenProvider
> {color:red}  2016-12-08 15:00:26,326 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0]
coprocessor.CoprocessorHost (CoprocessorHost.java:shutdown(292)) - Stop coprocessor org.apache.hadoop.hbase.security.access.AccessController
 {color}
> 2016-12-08 15:00:26,327 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] coprocessor.CoprocessorHost
(CoprocessorHost.java:shutdown(292)) - Stop coprocessor org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint
> 2016-12-08 15:00:26,327 DEBUG [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] coprocessor.CoprocessorHost
(CoprocessorHost.java:shutdown(292)) - Stop coprocessor org.apache.hadoop.hbase.regionserver.ExternalMetricObserver
> 2016-12-08 15:00:26,328 INFO  [RS_CLOSE_REGION-dx-data-hbase-watcher05:60020-0] regionserver.HRegion
(HRegion.java:doClose(1279)) - Closed data-probe-test,,1481180420784.5f06cb6447343b602e05996bfd87ce14.
> {color:red}  2016-12-08 15:00:27,590 ERROR [regionserver60020-EventThread] zookeeper.ClientCnxn
(ClientCnxn.java:processEvent(524)) - Error while calling watcher
> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1851ab3a
rejected from java.util.concurrent.ThreadPoolExecutor@19c0794f[Terminated, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 1]
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>         at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
>         at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:638)
>         at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeDataChanged(ZKPermissionWatcher.java:162)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:461)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2016-12-08 15:00:27,595 ERROR [regionserver60020-EventThread] zookeeper.ClientCnxn (ClientCnxn.java:processEvent(524))
- Error while calling watcher
> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@646fdb38
rejected from java.util.concurrent.ThreadPoolExecutor@19c0794f[Terminated, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 1]
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>         at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
>         at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:638)
>         at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:201)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:468)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {color}
> 2016-12-08 15:01:19,389 INFO  [PriorityRpcServer.handler=6,queue=0,port=60020] regionserver.HRegionServer
(HRegionServer.java:openRegion(3935)) - Open hbase:acl,,1480596080128.854f160ee9c3f2d1b591d6dbf211956e.
> 2016-12-08 15:01:19,443 DEBUG [RS_OPEN_REGION-dx-data-hbase-watcher05:60020-1] zookeeper.ZKAssign
(ZKAssign.java:transitionNode(832)) - regionserver:60020-0x358ba43097e1ada, quorum=dx-data-hbase-watcher02:2181,dx-data-hbase-watcher01:2181,dx-data-hbase-watcher03:2181,
baseZNode=/hbase Transitioning 854f160ee9c3f2d1b591d6dbf211956e from M_ZK_REGION_OFFLINE to
RS_ZK_REGION_OPENING
> {color:red}
> After look up source, I found that :
> 1 AccessController implements RegionObserver and  RegionServerObserver.
> 2 The reference of "ZooKeeperWatcher zk" is a same one object when the instance of AccessController
was be used as RegionObserver  and RegionServerObserver.
> 3 When closing region, org.apache.hadoop.hbase.security.access.AccessController.stop(CoprocessorEnvironment)
,  TableAuthManager.release(authManager),    org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.close()
will be called.
> 4 If there is only one region on regionserver and I close it , TableAuthManager.release(authManager)
will close the instance of TableAuthManager and shutdown the ExecutorService of  ZKPermissionWatcher
.
> 5 If I grant a new permission when the ExecutorService is terminated, the exception before
whitch is a runtimeexcepetion will occurs, and at last the zookeeper watcher was crashed.
> So, The instance of TableAuthManager should not be close unless regionserver is closing.
> {color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message