hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19216) Use procedure to execute replication peer related operations
Date Wed, 15 Nov 2017 14:02:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253478#comment-16253478 ]

Duo Zhang commented on HBASE-19216:

I plan to add a reportProcedureDone method to RegionServerStatusService. For the failure path,
I think the current framework works well. We can retry forever if the remote procedure
call cannot be sent, and eventually a remoteCallFailed will be triggered and we can give up.

For the normal path, I have the overall picture, but some details are still hazy.
I plan to add a procedureId to the request, and the RS will report the procedureId back when
it is done. We can look up the procedure with this procedureId, but then I'm a little confused.
How can I wake up a suspended procedure? There seems to be a ProcedureEvent, but how is it
created, and how can I get it when I only have a procedureId? Do I need to create one myself
when suspending the procedure and store it in the procedure, so that I can get it through the
procedureId?
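
To make the question concrete, here is a minimal sketch of the bookkeeping I have in mind. It
deliberately does not use the procedure v2 classes: PendingProcedures and all of its methods
are hypothetical names, and a plain CompletableFuture stands in for whatever event object the
framework really provides. The only idea it shows is "create the event when suspending, keep
it keyed by procedureId, and complete it when the RS reports back".

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry, NOT part of HBase: maps a procedure id to the
 * event that the suspended procedure is waiting on.
 */
final class PendingProcedures {
    private final Map<Long, CompletableFuture<Void>> events = new ConcurrentHashMap<>();

    /** Called by the procedure right before it suspends; creates the event. */
    CompletableFuture<Void> suspendOn(long procId) {
        return events.computeIfAbsent(procId, id -> new CompletableFuture<>());
    }

    /**
     * Normal path: the RS reports the procedure id back as done.
     * Returns false if no procedure was waiting under that id.
     */
    boolean reportProcedureDone(long procId) {
        CompletableFuture<Void> event = events.remove(procId);
        if (event == null) {
            return false;
        }
        event.complete(null); // wakes up whoever waits on the event
        return true;
    }

    /** Failure path: the remote call could not be delivered, so give up. */
    boolean remoteCallFailed(long procId, Throwable cause) {
        CompletableFuture<Void> event = events.remove(procId);
        if (event == null) {
            return false;
        }
        event.completeExceptionally(cause);
        return true;
    }
}
```

Whether the real framework wants the event stored inside the procedure itself or in a side
table like this is exactly the part I am unsure about.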

Any help is appreciated. I'm still a beginner with the procedure v2 framework. Thanks, [~stack].

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
> When building the basic framework for HBASE-19064, I found that enabling/disabling a peer
> is built upon a zk watcher.
> The problem with using a watcher is that you do not know the exact time when all RSes in
> the cluster have applied the change; it is only 'eventually done'.
> And for synchronous replication, when changing the state of a replication peer, we need
> to know the exact time, as we can only enable read/write after that time. So I think we'd
> better use a procedure to do this: change the flag on zk, and then execute a procedure on
> all RSes to reload the flag from zk.
> Another benefit is that, after the change, zk will be used mainly as storage, so it
> will be easy to implement another replication peer storage to replace zk so that we can
> reduce the dependency on zk.
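
The flow described in the issue (change the flag in storage first, then have every RS reload
it and report back) might look roughly like the sketch below. This is only an illustration
under simplifying assumptions: FakeRegionServer and PeerStateChange are hypothetical names,
an AtomicBoolean stands in for the peer state stored on zk, and an in-process async call
stands in for the remote procedure. The point is that the caller gets a definite completion
time, which a watcher does not provide.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a region server that caches the peer state. */
final class FakeRegionServer {
    private volatile boolean peerEnabled;

    /** Reloads the flag from the shared storage; the future is the acknowledgement. */
    CompletableFuture<Void> refreshPeerState(AtomicBoolean storage) {
        return CompletableFuture.runAsync(() -> peerEnabled = storage.get());
    }

    boolean isPeerEnabled() {
        return peerEnabled;
    }
}

final class PeerStateChange {
    /**
     * Writes the flag to storage, then asks every server to reload it.
     * The returned future completes only after all servers have acknowledged.
     */
    static CompletableFuture<Void> setPeerEnabled(
            AtomicBoolean storage, List<FakeRegionServer> servers, boolean enabled) {
        storage.set(enabled);
        CompletableFuture<?>[] acks = servers.stream()
                .map(rs -> rs.refreshPeerState(storage))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(acks);
    }
}
```

Once the returned future completes, every server in the list has reloaded the flag, so it
would be safe to enable read/write from that point on.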

This message was sent by Atlassian JIRA