hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-9465) Push entries to peer clusters serially
Date Wed, 08 Nov 2017 22:14:01 GMT

     [ https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-9465:
    Fix Version/s:     (was: 1.4.0)

I have committed and pushed the two reverts to branch-1.4 and updated fix versions here accordingly.
I ran all replication unit tests 10 times and they all passed. When working on the 1.4.0 release
candidate I'll also run the replication integration test and test a simple cross cluster replication
scenario by hand. Does anything else need to be done now? I think not; at some point we will
get documentation for the commit to branch-1 for a future release. Please let me know if I've missed something.

> Push entries to peer clusters serially
> --------------------------------------
>                 Key: HBASE-9465
>                 URL: https://issues.apache.org/jira/browse/HBASE-9465
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Replication
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Honghua Feng
>            Assignee: Phil Yang
>            Priority: Critical
>             Fix For: 2.0.0, 1.5.0
>         Attachments: HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch,
HBASE-9465-branch-1-v3.patch, HBASE-9465-branch-1-v4.patch, HBASE-9465-branch-1-v4.patch,
HBASE-9465-branch-1.v4.revert.patch, HBASE-9465-v1.patch, HBASE-9465-v2.patch, HBASE-9465-v2.patch,
HBASE-9465-v3.patch, HBASE-9465-v4.patch, HBASE-9465-v5.patch, HBASE-9465-v6.patch, HBASE-9465-v6.patch,
HBASE-9465-v7.patch, HBASE-9465-v7.patch, HBASE-9465.pdf
> When a region move or RS failure occurs in the master cluster, the hlog entries that were not
pushed before the region move or RS failure will be pushed by the original RS (for a region move) or
by another RS that takes over the remaining hlogs of the dead RS (for an RS failure), while new entries
for the same region(s) will be pushed by the RS that now serves the region(s). These RSs push
the hlog entries of the same region concurrently, without coordination.
> This can lead to data inconsistency between the master and peer clusters:
> 1. a put and then a delete are written to the master cluster
> 2. due to a region move / RS failure, they are pushed to the peer cluster by different
replication-source threads
> 3. if the delete is pushed to the peer cluster before the put, and a flush and major compaction
occur in the peer cluster before the put is pushed, the delete marker is collected and the put
remains in the peer cluster
> In this scenario, the put remains in the peer cluster, but in the master cluster the put is masked
by the delete, hence the data inconsistency between the master and peer clusters.
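
The scenario described above can be sketched with a toy model. This is only an illustration under stated assumptions, not HBase code: `ToyStore`, `replay`, the timestamps and the value `v1` are all hypothetical names invented for this sketch, and the compaction rule is simplified to "drop delete markers together with the puts they mask".

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy model of a single row/column as a list of (timestamp, kind, value) cells."""
    cells: list = field(default_factory=list)

    def apply(self, ts, kind, value=None):
        # Replicated edits keep the timestamps they were given on the master.
        self.cells.append((ts, kind, value))

    def _newest_delete(self):
        return max((ts for ts, kind, _ in self.cells if kind == "delete"), default=-1)

    def major_compact(self):
        # Simplified rule: drop delete markers together with every put they mask.
        cutoff = self._newest_delete()
        self.cells = [c for c in self.cells if c[1] == "put" and c[0] > cutoff]

    def read(self):
        # A put is visible only if it is newer than the newest delete marker.
        cutoff = self._newest_delete()
        visible = [(ts, v) for ts, kind, v in self.cells if kind == "put" and ts > cutoff]
        return max(visible, key=lambda c: c[0])[1] if visible else None


PUT = (1, "put", "v1")        # hypothetical edit written first on the master
DELETE = (2, "delete", None)  # hypothetical edit written second on the master


def replay(steps):
    """Apply edits in order; the string "compact" triggers a major compaction."""
    store = ToyStore()
    for step in steps:
        if step == "compact":
            store.major_compact()
        else:
            store.apply(*step)
    return store.read()


master = replay([PUT, DELETE])                     # put is masked by the delete
peer_reordered = replay([DELETE, "compact", PUT])  # delete collected before the put arrives
peer_serial = replay([PUT, DELETE, "compact"])     # edits pushed in their original order

print(master, peer_reordered, peer_serial)
```

In this toy model the reordered peer ends up returning `v1` while the master returns nothing, which is the inconsistency the issue describes; pushing the edits in their original order avoids it.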
