hbase-issues mailing list archives

From "Phil Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely
Date Fri, 01 Jul 2016 07:49:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358591#comment-15358591 ]

Phil Yang commented on HBASE-16144:
-----------------------------------

If the RS gets "session expired", RecoverableZooKeeper will try to reconnect instead of
crashing the RS. If we use an ephemeral node for the lock, the lock is gone after the
reconnect, so more than one RS may copy the queue. In other words, if the ephemeral node
has disappeared, we cannot conclude that the server has died.
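
To make the race concrete, here is a toy in-memory model of the scenario above. It is only a
sketch: ToyZk is a hypothetical stand-in, not the ZooKeeper client API, and the lock path and
session ids are made up for illustration. The one ZooKeeper behavior it relies on is that
ephemeral nodes are deleted when the owning session expires.

```python
class ToyZk:
    """In-memory stand-in for ZooKeeper: path -> (owner session id, ephemeral flag)."""

    def __init__(self):
        self.nodes = {}

    def create(self, path, session, ephemeral=False):
        if path in self.nodes:
            return False  # node already exists, so the lock is held
        self.nodes[path] = (session, ephemeral)
        return True

    def expire_session(self, session):
        # Ephemeral nodes vanish together with the session that created them.
        self.nodes = {p: v for p, v in self.nodes.items()
                      if not (v[0] == session and v[1])}


zk = ToyZk()
lock = "/replication/rs/dead-rs/lock"  # hypothetical path
copiers = []

# RS A takes the ephemeral lock and starts copying the dead server's queue.
assert zk.create(lock, session="A-1", ephemeral=True)
copiers.append("A")

# A's session expires, but A's process is still alive and simply reconnects.
zk.expire_session("A-1")

# RS B sees no lock, takes it, and starts copying the same queue.
if zk.create(lock, session="B-1", ephemeral=True):
    copiers.append("B")

print(copiers)  # ['A', 'B']
```

Both A and B end up copying the queue, which is why a missing ephemeral node cannot be read as
"the lock holder is dead".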

> Replication queue's lock will live forever if RS acquiring the lock has died prematurely
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-16144
>                 URL: https://issues.apache.org/jira/browse/HBASE-16144
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 1.1.5, 0.98.20
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>         Attachments: HBASE-16144-v1.patch, HBASE-16144-v2.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if we set hbase.zookeeper.useMulti=false,
> we add a lock first, then copy the nodes, and finally clean up the old queue and the lock.
> However, if the RS acquiring the lock crashes before claimQueues is done, the lock will always
> be there and other RSs can never claim the queue.
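
The lock, copy, clean sequence described in the issue can be sketched as a toy model. This is
not the HBase implementation: claim_queue, RegionServerCrash, the flat dict of nodes, and the
"rs1"/"rs2"/"rs3" names are all hypothetical, chosen only to show how a crash between taking
the lock and cleaning up leaves a persistent lock behind.

```python
class RegionServerCrash(Exception):
    """Simulates the claiming RS dying partway through claim_queue."""


def claim_queue(nodes, dead_rs, claimer, crash_after_lock=False):
    """Toy non-multi claim: take a persistent lock, copy nodes, then clean up."""
    lock = f"{dead_rs}/lock"
    if lock in nodes:
        return False               # someone else holds (or abandoned) the lock
    nodes[lock] = claimer          # step 1: add the lock (persistent node)
    if crash_after_lock:
        raise RegionServerCrash(claimer)
    old = [p for p in nodes if p.startswith(dead_rs + "/") and p != lock]
    for path in old:               # step 2: copy the queue nodes to the claimer
        nodes[f"{claimer}/{path.replace('/', '-')}"] = nodes[path]
    for path in old + [lock]:      # step 3: remove the old queue and the lock
        del nodes[path]
    return True


nodes = {"rs1/peer1": "wal positions"}

# rs2 takes the lock and then crashes before copying anything.
try:
    claim_queue(nodes, "rs1", "rs2", crash_after_lock=True)
except RegionServerCrash:
    pass

# rs3 can never claim the queue: the stale lock is still there.
print(claim_queue(nodes, "rs1", "rs3"))  # False
print(nodes["rs1/lock"])                 # rs2
```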



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
