hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
Date Thu, 08 Dec 2016 17:05:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732758#comment-15732758 ]

Hudson commented on HBASE-16947:

SUCCESS: Integrated in Jenkins build HBase-0.98-matrix #424 (See [https://builds.apache.org/job/HBase-0.98-matrix/424/])
HBASE-16947 Some improvements for DumpReplicationQueues tool (apurtell: rev 52566bb0325b9f9ac38450ae04c7a9e5892a493d)
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesClientZKImpl.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>                 Key: HBASE-16947
>                 URL: https://issues.apache.org/jira/browse/HBASE-16947
>             Project: HBase
>          Issue Type: Improvement
>          Components: Operability, Replication
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>             Fix For: 2.0.0, 1.4.0, 1.3.1, 0.98.24
>         Attachments: HBASE-16947-branch-1.patch, HBASE-16947-branch-1.patch, HBASE-16947-branch-1.patch,
HBASE-16947-v1.patch, HBASE-16947.patch
> Recently we ran into a problem with too many replication WALs in our production cluster. We needed
the DumpReplicationQueues tool to analyze the replication queue info in ZooKeeper, so I backported
HBASE-16450 to our 0.98-based branch and made some improvements to it.
> 1. Show the dead regionservers under the replication/rs znode. When there are too many WALs
under a znode, they can't be transferred atomically to the new rs znode, so the dead rs znode is
left behind in ZooKeeper.
> 2. Summarize all the queues that belong to peers that have been deleted.
> 3. Aggregate the replication queue sizes of all regionservers. Currently each regionserver reports
ReplicationLoad to the master, but there is no aggregate metric for replication.
> 4. Show how many WALs cannot be found on HDFS. The reason (WAL Not Found) needs
more time to dig into.
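
The checks in items 1, 2 and 4 come down to set arithmetic over what is stored in ZooKeeper and on HDFS. The following is a minimal sketch of that idea using plain Java collections only. It is not the code from the patch: the class and method names are hypothetical, and it assumes that a queue id is either "peerId" or "peerId-deadServer...", so that the peer id is the part before the first dash.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of HBase: illustrates the summaries described above.
class ReplicationQueueSummary {

    // Item 1: servers that still own a queue znode but are no longer live.
    static Set<String> deadRegionServers(Set<String> queueOwners, Set<String> liveServers) {
        Set<String> dead = new TreeSet<>(queueOwners);
        dead.removeAll(liveServers);
        return dead;
    }

    // Assumption: the peer id is the queue id up to the first '-'.
    static String peerIdOf(String queueId) {
        int dash = queueId.indexOf('-');
        return dash < 0 ? queueId : queueId.substring(0, dash);
    }

    // Item 2: number of queued WALs per peer that no longer exists.
    static Map<String, Integer> walsOfDeletedPeers(Map<String, List<String>> queues,
                                                   Set<String> existingPeers) {
        Map<String, Integer> summary = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : queues.entrySet()) {
            String peer = peerIdOf(e.getKey());
            if (!existingPeers.contains(peer)) {
                summary.merge(peer, e.getValue().size(), Integer::sum);
            }
        }
        return summary;
    }

    // Item 4: number of queued WALs that are not present in the given HDFS listing.
    static long countMissingWals(Map<String, List<String>> queues, Set<String> walsOnHdfs) {
        long missing = 0;
        for (List<String> wals : queues.values()) {
            for (String wal : wals) {
                if (!walsOnHdfs.contains(wal)) {
                    missing++;
                }
            }
        }
        return missing;
    }
}
```

In this sketch the caller is expected to have already read the queue owners, queue contents, peer ids and WAL file names into memory; how the actual tool reads them from ZooKeeper and HDFS is not shown here.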

This message was sent by Atlassian JIRA
