hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19216) Use procedure to execute replication peer related operations
Date Wed, 15 Nov 2017 14:02:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253478#comment-16253478 ]

Duo Zhang commented on HBASE-19216:
-----------------------------------

I plan to add a reportProcedureDone method to RegionServerStatusService. For the failure path,
I think the current framework works well. We can retry forever if the remote procedure
call cannot be sent, and eventually a remoteCallFailed will be triggered and we can give up
retrying.

For the normal path, I have the overall picture, but some details are still hazy.
I plan to add a procedureId to the request, and the RS will report the procedureId back when it
is done. We can look up the procedure with this procedureId, but then I'm a little confused. How
can I wake up a suspended procedure? There seems to be a ProcedureEvent, but how is it generated,
and how can I get it when I only have a procedureId? Do I need to create one myself when suspending
the procedure and store it in the procedure, so that I can get it through the procedureId?
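
To make the last question concrete, here is a rough sketch in plain Java of the idea of creating
an event when suspending and finding it again through the procedureId. It does not use the
procedure v2 API: ProcedureWakeRegistry, suspend and the CompletableFuture standing in for
ProcedureEvent are all hypothetical names for illustration, and only reportProcedureDone matches
the method name proposed above.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: none of these types are HBase or procedure v2 classes.
// A CompletableFuture stands in for the "event" a suspended procedure waits on.
final class ProcedureWakeRegistry {
    // procedureId -> event created at suspend time
    private final Map<Long, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();

    /** Called by the master-side procedure just before it suspends. */
    CompletableFuture<Void> suspend(long procId) {
        return pending.computeIfAbsent(procId, id -> new CompletableFuture<>());
    }

    /**
     * Called when a region server reports the procedureId back.
     * Returns false if no procedure is waiting under that id.
     */
    boolean reportProcedureDone(long procId) {
        CompletableFuture<Void> event = pending.remove(procId);
        if (event == null) {
            return false;
        }
        event.complete(null); // wakes whatever was registered on the event
        return true;
    }

    public static void main(String[] args) {
        ProcedureWakeRegistry registry = new ProcedureWakeRegistry();
        CompletableFuture<Void> event = registry.suspend(42L);
        event.thenRun(() -> System.out.println("procedure 42 woken up"));
        registry.reportProcedureDone(42L);
    }
}
```

The sketch keeps the event in a side map keyed by procedureId rather than inside the procedure
itself; storing it in the procedure, as asked above, would work the same way once the procedure
can be looked up by id.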

Any help is appreciated. I'm still a beginner with the procedure v2 framework. Thanks, sir [~stack].

> Use procedure to execute replication peer related operations
> ------------------------------------------------------------
>
>                 Key: HBASE-19216
>                 URL: https://issues.apache.org/jira/browse/HBASE-19216
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> When building the basic framework for HBASE-19064, I found that enabling/disabling a peer
> is built upon a zk watcher.
> The problem with using a watcher is that you do not know the exact time when all RSes in
> the cluster have applied the change; it is only 'eventually done'.
> And for synchronous replication, when changing the state of a replication peer, we need
> to know the exact time, as we can only enable read/write after that time. So I think we'd better
> use a procedure to do this: change the flag on zk, and then execute a procedure on all RSes
> to reload the flag from zk.
> Another benefit is that, after the change, zk will be used mainly as storage, so it
> will be easy to implement another replication peer storage to replace zk, so that we can reduce
> the dependency on zk.
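
The point about knowing the exact time when every RS has reloaded the flag can be illustrated
with a small sketch in plain Java. PeerChangeTracker, ack and allDone are hypothetical names
invented for this example, not HBase classes; the sketch only assumes that each RS sends some
acknowledgement once it has reloaded the flag.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch only: tracks which region servers still have to confirm
// that they reloaded the peer state, so the caller knows when all are done.
final class PeerChangeTracker {
    private final Set<String> pending = new HashSet<>();

    PeerChangeTracker(Collection<String> regionServers) {
        pending.addAll(regionServers);
    }

    /** Records one acknowledgement; returns true only for the ack that completes the set. */
    synchronized boolean ack(String server) {
        return pending.remove(server) && pending.isEmpty();
    }

    /** True once every region server has acknowledged the change. */
    synchronized boolean allDone() {
        return pending.isEmpty();
    }

    public static void main(String[] args) {
        PeerChangeTracker tracker = new PeerChangeTracker(Arrays.asList("rs1", "rs2"));
        tracker.ack("rs1");
        System.out.println("all done after rs1: " + tracker.allDone());
        tracker.ack("rs2");
        System.out.println("all done after rs2: " + tracker.allDone());
    }
}
```

With a watcher there is no equivalent of the moment ack returns true; with an explicit
acknowledgement per RS, that moment is when read/write could safely be enabled.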



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
