hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19216) Use procedure to execute replication peer related operations
Date Fri, 17 Nov 2017 01:33:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256287#comment-16256287 ]

stack commented on HBASE-19216:

bq. For a peer change, I think it is idempotent, so we can retry forever if an RS fails to
report in.

Ok. We just need to stop pinging if the server goes away.
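
A rough sketch of that retry rule, only to pin down the intent: keep resending the idempotent peer-refresh request until the RS acknowledges it, and give up only once the server is no longer online. Every name here (RetryUntilGoneSketch, pushUntilAcked, the two suppliers) is made up for illustration and is not HBase API; a real version would back off between attempts.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not HBase code.
final class RetryUntilGoneSketch {
  enum Outcome { ACKED, SERVER_GONE }

  /**
   * Resends an idempotent request until it is acknowledged.
   * serverOnline: assumed liveness check for the target RS.
   * sendRefresh: assumed call that returns true once the RS has acked.
   */
  static Outcome pushUntilAcked(BooleanSupplier serverOnline, BooleanSupplier sendRefresh) {
    // Liveness is checked before every attempt so we stop pinging a dead server.
    while (serverOnline.getAsBoolean()) {
      if (sendRefresh.getAsBoolean()) {
        return Outcome.ACKED;
      }
      // A real implementation would sleep or reschedule here instead of spinning.
    }
    return Outcome.SERVER_GONE;
  }
}
```

Because the request is idempotent, a duplicate delivery after a lost ack is harmless, which is what makes the unbounded loop safe.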

bq. I plan to add a reportProcedureDone method in RegionServerStatusService

Ok. Should do for a few procedure types.

bq. How can I wake up a suspended procedure?

In Assign/Unassign, we have RegionStateNodes (RSN) that hold a reference to the Procedure
that is manipulating the RS, plus an associated ProcedureEvent (PE). Suspend/resume operates on
the RSN's PE. Before we dispatch an RPC, we do a suspend on the RSN's PE. When the RS has transitioned
the Region, it updates the Master by calling reportRegionStateTransition. The Master finds the pertinent
RSN using RegionInfo as the key. We pull out the Procedure and call reportTransition on it. After
updating state in the Procedure, the last thing done is a wake-up call on the PE.
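
The flow above could be modelled roughly as follows. This is a simplified stand-in, not the actual AssignmentManager code: Event, TransitionProcedure, StateNode and the String region key are hypothetical placeholders for ProcedureEvent, the assign/unassign Procedure, RegionStateNode and RegionInfo, and the "suspended" flag only records state rather than parking a real procedure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the suspend / report / wake sequence.
final class SuspendWakeSketch {
  /** Stand-in for a ProcedureEvent: records whether the waiter is suspended. */
  static final class Event {
    private boolean suspended;
    synchronized void suspend() { suspended = true; }
    synchronized void wake() { suspended = false; }
    synchronized boolean isSuspended() { return suspended; }
  }

  /** Stand-in for the Procedure that is waiting on the remote transition. */
  static final class TransitionProcedure {
    private volatile String lastReportedState = "NONE";
    void reportTransition(String state) { lastReportedState = state; }
    String lastReportedState() { return lastReportedState; }
  }

  /** Stand-in for a RegionStateNode: carries the Procedure and its event. */
  static final class StateNode {
    final Event event = new Event();
    volatile TransitionProcedure procedure;
  }

  private final Map<String, StateNode> nodes = new ConcurrentHashMap<>();

  /** Master side: attach the procedure and suspend on the node's event before the RPC goes out. */
  StateNode dispatch(String regionKey, TransitionProcedure proc) {
    StateNode node = nodes.computeIfAbsent(regionKey, k -> new StateNode());
    node.procedure = proc;
    node.event.suspend();
    // The RPC to the region server would be sent here.
    return node;
  }

  /** Called when the RS reports back: look up the node, update the procedure, then wake. */
  boolean reportRegionStateTransition(String regionKey, String state) {
    StateNode node = nodes.get(regionKey);
    if (node == null || node.procedure == null) {
      return false; // nothing is waiting on this region
    }
    node.procedure.reportTransition(state);
    node.event.wake(); // the wake-up is the last step, after state is updated
    return true;
  }
}
```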

We'd have a registry of Peers in Master (ReplicationPeers?), keyed by peerid? The Peer in
Master would carry the Procedure and PE references.

Something like that.

bq. I need to create one by myself when suspending the procedure and store it in the procedure,
so I can get it through the procedureId?

When we create a Peer, it would have in it a PE. The PE would not be created each time we
want to do a suspend because we want to guard against having more than one operation going
on against a Peer at a time. The key could be procedureid but could it be peerid instead?
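
To make the idea concrete, here is a minimal sketch of such a registry, assuming it is keyed by peerid. PeerRegistrySketch, Peer, tryStart and finish are hypothetical names, the event is a bare placeholder object, and procedure ids are assumed to be non-negative so that -1 can mean "no operation in flight".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not HBase code.
final class PeerRegistrySketch {
  static final long NO_PROCEDURE = -1L; // assumes real procedure ids are >= 0

  static final class Peer {
    // Created once with the Peer and reused for every suspend/wake.
    final Object event = new Object();
    // Id of the procedure currently operating on this peer, or NO_PROCEDURE.
    private final AtomicLong owner = new AtomicLong(NO_PROCEDURE);

    /** Returns true only if no other operation is running against this peer. */
    boolean tryStart(long procId) {
      return owner.compareAndSet(NO_PROCEDURE, procId);
    }

    /** Releases the peer; only the procedure that owns it can do so. */
    boolean finish(long procId) {
      return owner.compareAndSet(procId, NO_PROCEDURE);
    }
  }

  private final Map<String, Peer> peers = new ConcurrentHashMap<>();

  /** Looks up the Peer by peerid, creating it (and its single event) on first use. */
  Peer getOrCreate(String peerId) {
    return peers.computeIfAbsent(peerId, id -> new Peer());
  }
}
```

Keying by peerid rather than procedureid means a second procedure against the same peer finds the existing entry and is refused until the first one finishes.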

So, setting peer would work like 

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
> When building the basic framework for HBASE-19064, I found that the enable/disable peer
> is built upon the watcher of zk.
> The problem of using watcher is that, you do not know the exact time when all RSes in
> the cluster have done the change, it is a 'eventually done'.
> And for synchronous replication, when changing the state of a replication peer, we need
> to know the exact time as we can only enable read/write after that time. So I think we'd better
> use procedure to do this. Change the flag on zk, and then execute a procedure on all RSes
> to reload the flag from zk.
> Another benefit is that, after the change, zk will be mainly used as a storage, so it
> will be easy to implement another replication peer storage to replace zk so that we can reduce
> the dependency on zk.

This message was sent by Atlassian JIRA
