hbase-dev mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2223) Handle 10min+ network partitions between clusters
Date Fri, 19 Feb 2010 00:54:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835532#action_12835532 ]

Jean-Daniel Cryans commented on HBASE-2223:
-------------------------------------------

Some design notes:

We need another class, sitting between ReplicationHLog and ReplicationSource, to manage
multiple ReplicationSources (going to many slave clusters); let's call it ReplicationSourceManager
(RSM) for the moment. That class should be responsible for taking action on, and keeping tabs
on, each outbound stream. When a source successfully sends a batch of edits to a peer, it should
report the latest HLogKey to the RSM so that we can match it to an HLog file (using the writeTime)
and then publish that position in ZooKeeper for each slave cluster.
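A rough Java sketch of that bookkeeping (all class and method names here are hypothetical, not actual HBase APIs; an in-memory map stands in for the ZooKeeper publish):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch of the RSM bookkeeping: map a reported HLogKey write time back to
// the HLog file that contains it, then record that position per slave cluster.
class ReplicationSourceManager {
  // HLog files keyed by the write time of their first entry.
  private final TreeMap<Long, String> hlogsByStartTime = new TreeMap<>();
  // Latest replicated HLog per peer cluster id (stand-in for ZK znodes).
  private final Map<String, String> positionPerPeer = new HashMap<>();

  void registerHLog(long firstWriteTime, String hlogName) {
    hlogsByStartTime.put(firstWriteTime, hlogName);
  }

  // Called by a ReplicationSource after a batch was acked by the peer.
  void logBatchShipped(String peerId, long lastWriteTime) {
    // The containing HLog is the one with the greatest start time <= lastWriteTime.
    Map.Entry<Long, String> entry = hlogsByStartTime.floorEntry(lastWriteTime);
    if (entry != null) {
      positionPerPeer.put(peerId, entry.getValue()); // would be a ZK write
    }
  }

  String getPosition(String peerId) {
    return positionPerPeer.get(peerId);
  }
}
```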

We could consider a peer unreachable if its ReplicationSource hasn't reported for X amount
of time (configurable; not sure what the default should be). Here I'm still wondering what
would be the best way to detect that a peer cluster is back... retrying connections to the
peer's ZK quorum? We also need to handle the case where the peer cluster is simply shut down
(using the shutdown znode). At that point we stop queuing entries for that source and pile
up all the HLogs to process in a list in ZK. We also need a way of telling the Master not to
delete those logs. We should account for the fact that an HLog may be moved to the oldlogs
directory, so if the HLog isn't in the local log dir, it's probably in the other directory.
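The oldlogs fallback could look something like this (a minimal sketch; to keep it self-contained, the set of files currently in the live log dir is passed in, where a real region server would do a FileSystem listing, and all names are illustrative):

```java
import java.util.Set;

// An HLog that is no longer in the region server's log directory has
// probably been rolled to the oldlogs directory, so look there next.
class HLogLocator {
  static String locate(String logDir, String oldLogDir,
                       Set<String> filesInLogDir, String hlogName) {
    if (filesInLogDir.contains(hlogName)) {
      return logDir + "/" + hlogName;   // still in the live log directory
    }
    return oldLogDir + "/" + hlogName;  // assume it was archived
  }
}
```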

When the peer cluster comes back, we process all the queued HLogs in order, without merging
them with the current flow of entries, since we would otherwise have two different sets of
HLogs to keep track of (we could improve this in the future). Only when we reach the current
HLog file do we flip the switch to take new entries. I expect that to be very tricky.
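The "catch up, then flip the switch" idea can be sketched as follows (hypothetical names throughout; the real trickiness is in doing this flip atomically against a live stream, which this sketch does not attempt):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Replay the backlog of HLogs strictly in order; only once the replay
// reaches the current HLog do we start accepting live edits.
class CatchUpSource {
  private final Deque<String> backlog;
  private final String currentHLog;
  private final List<String> shipped = new ArrayList<>();
  private boolean live = false;

  CatchUpSource(List<String> orderedBacklog, String currentHLog) {
    this.backlog = new ArrayDeque<>(orderedBacklog);
    this.currentHLog = currentHLog;
  }

  // Replay one queued HLog; flip to live mode when we reach the current one.
  void replayNext() {
    String hlog = backlog.poll();
    if (hlog == null) {
      return;
    }
    shipped.add(hlog);
    if (hlog.equals(currentHLog)) {
      live = true; // from now on, new entries are taken directly
    }
  }

  boolean acceptsLiveEntries() {
    return live;
  }

  List<String> shippedSoFar() {
    return shipped;
  }
}
```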

Even trickier is keeping track of those HLogs when a RS dies on the master cluster. The pile
of HLogs to process will still be in ZK, along with the latest HLogKey that was processed.
It means we somehow have to hand off that processing to one or more RSs. What I'm thinking
is that the master, when done splitting logs, should hand that pile to a single RS, which
will open a new ReplicationSource and hopefully complete the replication.
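The hand-off could be sketched like this (again hypothetical names; the queue map stands in for the state kept in ZK, and the master-side choice of survivor is left to the caller):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// When a region server dies, the pile of HLogs it still had to replicate
// (recorded in ZK) is reassigned to a single surviving region server,
// which would then open a fresh ReplicationSource for it.
class FailoverQueues {
  // region server name -> ordered HLogs still to replicate.
  private final Map<String, List<String>> queuesByRs = new HashMap<>();

  void record(String rsName, List<String> hlogs) {
    queuesByRs.put(rsName, new ArrayList<>(hlogs));
  }

  // Master-side: move the dead server's queue to the chosen survivor,
  // appending after whatever the survivor already had queued.
  void handOff(String deadRs, String survivorRs) {
    List<String> pile = queuesByRs.remove(deadRs);
    if (pile != null) {
      queuesByRs.computeIfAbsent(survivorRs, k -> new ArrayList<>()).addAll(pile);
    }
  }

  List<String> queueOf(String rsName) {
    return queuesByRs.getOrDefault(rsName, Collections.emptyList());
  }
}
```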

We can use the information published in ZK to learn the state of each replication stream per
peer and show that in a UI.

> Handle 10min+ network partitions between clusters
> -------------------------------------------------
>
>                 Key: HBASE-2223
>                 URL: https://issues.apache.org/jira/browse/HBASE-2223
>             Project: Hadoop HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>
> We need a nice way of handling long network partitions without impacting a master cluster
> (which pushes the data). Currently it will just retry over and over again.
> I think we could:
>  - Stop replication to a slave cluster if it didn't respond for more than 10 minutes
>  - Keep track of the duration of the partition
>  - When the slave cluster comes back, initiate a MR job like HBASE-2221 
> Maybe we want less than 10 minutes, maybe we want this to be all automatic or just the
> first 2 parts. Discuss.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

