hbase-issues mailing list archives

From "huaxiang sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18363) Hbck option to undeploy in memory replica parent region
Date Wed, 19 Jul 2017 01:34:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092454#comment-16092454
] 

huaxiang sun commented on HBASE-18363:
--------------------------------------

I checked the hbck code: "-fixAssignments" should already be able to fix this in-memory state.
I simulated this case, and hbck reported the following:
{code}
2017-07-18 18:19:10,192 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-07-18 18:19:10,192 INFO  [main] zookeeper.ZooKeeper: Session: 0x15d5869d2f50014 closed
2017-07-18 18:19:10,192 INFO  [main] util.HBaseFsck: Checking and fixing region consistency
*ERROR: Region { meta => null, hdfs => null, deployed => dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520;t1,r1,1500328224175_0001.d761ef3cc03d8a0124bb751f216f9285.,
replicaId => 1 } not in META, but deployed on dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520
ERROR: No regioninfo in Meta or HDFS. { meta => null, hdfs => null, deployed => dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520;t1,r1,1500328224175_0001.d761ef3cc03d8a0124bb751f216f9285.,
replicaId => 1 }*
2017-07-18 18:19:10,200 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel.
set hbasefsck.overlap.merge.parallel to false to run serially.
2017-07-18 18:19:10,205 INFO  [main] util.HBaseFsck: Computing mapping of all store files

2017-07-18 18:19:10,214 INFO  [main] util.HBaseFsck: Validating mapping using HDFS state
2017-07-18 18:19:10,220 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase
Fsck connecting to ZooKeeper ensemble=localhost:2181
2017-07-18 18:19:10,220 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181
sessionTimeout=90000 watcher=hbase Fsck0x0, quorum=localhost:2181, baseZNode=/hbase
2017-07-18 18:19:10,221 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening
socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using
SASL (unknown error)
2017-07-18 18:19:10,222 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket
connection established, initiating session, client: /127.0.0.1:60970, server: localhost/127.0.0.1:2181
2017-07-18 18:19:10,223 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session
establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15d5869d2f50016,
negotiated timeout = 40000
2017-07-18 18:19:10,230 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-07-18 18:19:10,230 INFO  [main] zookeeper.ZooKeeper: Session: 0x15d5869d2f50016 closed
2017-07-18 18:19:10,231 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase
Fsck connecting to ZooKeeper ensemble=localhost:2181
2017-07-18 18:19:10,231 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181
sessionTimeout=90000 watcher=hbase Fsck0x0, quorum=localhost:2181, baseZNode=/hbase
2017-07-18 18:19:10,232 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening
socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using
SASL (unknown error)
2017-07-18 18:19:10,233 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket
connection established, initiating session, client: /127.0.0.1:60971, server: localhost/127.0.0.1:2181
2017-07-18 18:19:10,234 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session
establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15d5869d2f50017,
negotiated timeout = 40000
2017-07-18 18:19:10,236 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-07-18 18:19:10,236 INFO  [main] zookeeper.ZooKeeper: Session: 0x15d5869d2f50017 closed
2017-07-18 18:19:10,236 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase
Fsck connecting to ZooKeeper ensemble=localhost:2181
2017-07-18 18:19:10,236 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181
sessionTimeout=90000 watcher=hbase Fsck0x0, quorum=localhost:2181, baseZNode=/hbase
2017-07-18 18:19:10,238 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening
socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using
SASL (unknown error)
2017-07-18 18:19:10,238 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket
connection established, initiating session, client: /127.0.0.1:60972, server: localhost/127.0.0.1:2181
2017-07-18 18:19:10,239 INFO  [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session
establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15d5869d2f50018,
negotiated timeout = 40000
2017-07-18 18:19:10,258 INFO  [main] zookeeper.ZooKeeper: Session: 0x15d5869d2f50018 closed
2017-07-18 18:19:10,258 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
Summary:

Table hbase:meta is okay.
    Number of regions: 1
    Deployed on:  dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520
Table t1 is okay.
    Number of regions: 4
    Deployed on:  dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520
Table hbase:quota is okay.
    Number of regions: 1
    Deployed on:  dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520
Table hbase:namespace is okay.
    Number of regions: 1
    Deployed on:  dhcp-172-16-1-203.pa.cloudera.com,60863,1500426918520
1 inconsistencies detected.

{code}

I was able to fix this issue by running "hbase hbck -fixAssignments".
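
To confirm a fix like this, the hbck report can be checked before and after the run. The following is a minimal sketch, assuming the report has been captured into a string. The function name `summarize_hbck` and both regular expressions are hypothetical and based only on the output format shown above (the "not in META, but deployed on" error and the "N inconsistencies detected." summary line); they are not part of HBase.

```python
import re

# Summary line printed at the end of the report, e.g. "1 inconsistencies detected."
SUMMARY_RE = re.compile(r"^(\d+) inconsistencies detected\.$", re.MULTILINE)

# Region that is deployed on a server but missing from META.
# DOTALL lets the match span the line wrap seen in the report above.
ORPHAN_RE = re.compile(
    r"ERROR: Region \{.*?replicaId => (\d+) \} not in META, but deployed on (\S+)",
    re.DOTALL,
)


def summarize_hbck(output: str) -> dict:
    """Extract the inconsistency count and deployed-but-not-in-META regions."""
    summary = SUMMARY_RE.search(output)
    count = int(summary.group(1)) if summary else None
    orphans = [(int(replica), server) for replica, server in ORPHAN_RE.findall(output)]
    return {"inconsistencies": count, "orphans": orphans}


if __name__ == "__main__":
    # Shortened sample in the same shape as the report above (host name is made up).
    sample = (
        "ERROR: Region { meta => null, hdfs => null, deployed => "
        "host1,60863,1500426918520;t1,r1,1500328224175_0001."
        "d761ef3cc03d8a0124bb751f216f9285.,\n"
        "replicaId => 1 } not in META, but deployed on host1,60863,1500426918520\n"
        "1 inconsistencies detected.\n"
    )
    print(summarize_hbck(sample))
```

If the fix worked, a second report should yield an empty `orphans` list and a count of 0.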

Resolving this issue as invalid.

> Hbck option to undeploy in memory replica parent region 
> --------------------------------------------------------
>
>                 Key: HBASE-18363
>                 URL: https://issues.apache.org/jira/browse/HBASE-18363
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck
>    Affects Versions: 1.4.0, 2.0.0-alpha-1
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>            Priority: Minor
>
> We have run into cases where, with read replicas, the parent replica region is sometimes
> left in the master's in-memory onlineRegion list after a split. As a result, the region gets
> assigned to a region server. Although the root cause will be fixed by HBASE-18025, we need to
> enhance the hbck tool to fix this in-memory state. Currently, hbck only allows the fix for the
> primary region (in this case, the primary region is gone) with the -fixAssignments option;
> please see the following line of code. We will enhance it so that it can be applied to replica
> regions as well.
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java#L2216



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
