hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12769) Replication fails to delete all corresponding zk nodes when peer is removed
Date Mon, 29 Dec 2014 23:08:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260527#comment-14260527 ]

Enis Soztutar commented on HBASE-12769:
---------------------------------------

I think the proper solution is to do HBASE-11392 first, then keep the state of the "remove
peer" operation in the master and ensure that it completes via HBASE-5487 or HBASE-12439.

> Replication fails to delete all corresponding zk nodes when peer is removed
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-12769
>                 URL: https://issues.apache.org/jira/browse/HBASE-12769
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 0.99.2
>            Reporter: cuijianwei
>            Priority: Minor
>
> When a peer is removed, the client deletes the peerId node under peersZNode; alive
> region servers are then notified and delete the corresponding hlog queues under their own
> rsZNode for replication. However, if there are failed servers whose hlog queues have not yet
> been transferred to alive servers (this is likely when "replication.sleep.before.failover" is
> set to a large value and many region servers have restarted), those hlog queues are not deleted
> after the peer is removed. I think remove_peer should guarantee that all corresponding zk nodes
> have been removed by the time it completes; otherwise, if we create a new peer with the same
> peerId as the removed one, unexpected data might be replicated.
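
The scenario in the description can be sketched with a small in-memory model. This is
only an illustration, not HBase code: the dict standing in for the znode tree, the
function names, and the queue-id convention ("<peerId>" or "<peerId>-<deadServer>...")
are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def queue_belongs_to_peer(queue_id: str, peer_id: str) -> bool:
    # Assumption: a queue id is either "<peerId>" or "<peerId>-<deadServer>..."
    # (the latter for a queue that was claimed from a failed server).
    return queue_id == peer_id or queue_id.startswith(peer_id + "-")


def remove_peer(peers: Set[str], rs_znode: Dict[str, Set[str]],
                live_servers: Set[str], peer_id: str) -> None:
    """Behaviour described in the report: only alive servers clean up."""
    peers.discard(peer_id)  # client deletes peerId under peersZNode
    for server in live_servers:  # alive servers react to the notification
        queues = rs_znode.get(server, set())
        rs_znode[server] = {q for q in queues
                            if not queue_belongs_to_peer(q, peer_id)}


def remove_peer_strict(peers: Set[str], rs_znode: Dict[str, Set[str]],
                       peer_id: str) -> None:
    """Hypothetical variant: scan every server znode, alive or not."""
    peers.discard(peer_id)
    for server in list(rs_znode):
        rs_znode[server] = {q for q in rs_znode[server]
                            if not queue_belongs_to_peer(q, peer_id)}


def leftover_queues(rs_znode: Dict[str, Set[str]],
                    peer_id: str) -> List[Tuple[str, str]]:
    """Return (server, queue) pairs that still refer to the peer."""
    return sorted((s, q) for s, qs in rs_znode.items() for q in qs
                  if queue_belongs_to_peer(q, peer_id))


def demo() -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    peers = {"1", "2"}
    # rs2 has failed and its queues have not been transferred yet.
    rs_znode = {"rs1": {"1", "2"}, "rs2": {"1", "2"}}
    remove_peer(peers, rs_znode, live_servers={"rs1"}, peer_id="1")
    after_plain = leftover_queues(rs_znode, "1")
    remove_peer_strict(peers, rs_znode, peer_id="1")
    after_strict = leftover_queues(rs_znode, "1")
    return after_plain, after_strict
```

In this model, demo() leaves the queue for peer "1" under the failed server rs2 after the
plain removal, and a later peer created with id "1" would inherit it. The strict variant
clears it, but it ignores races with a concurrent queue transfer, which is one reason a
master-coordinated operation may be the safer design.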



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
