Date: Wed, 28 Sep 2016 20:52:20 +0000 (UTC)
From: "Vincent Poon (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-9465) Push entries to peer clusters serially

    [ https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530835#comment-15530835 ]

Vincent Poon commented on HBASE-9465:
-------------------------------------

I think you need the code addition above 'continue' here - if readAllEntries... returns true, then lastPositionsForSerialScope should always be empty?

{code}
if (readAllEntriesToReplicateOrNextFile(currentWALisBeingWrittenTo, entries,
    lastPositionsForSerialScope)) {
  for (Map.Entry entry : lastPositionsForSerialScope.entrySet()) {
    waitingUntilCanPush(entry);
  }
  try {
    MetaTableAccessor
        .updateReplicationPositions(manager.getConnection(), actualPeerId,
            lastPositionsForSerialScope);
  } catch (IOException e) {
    LOG.error("updateReplicationPositions fail", e);
    stopper.stop("updateReplicationPositions fail");
  }
  continue;
}
{code}

> Push entries to peer clusters serially
> --------------------------------------
>
>                 Key: HBASE-9465
>                 URL: https://issues.apache.org/jira/browse/HBASE-9465
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Replication
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Honghua Feng
>            Assignee: Phil Yang
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch, HBASE-9465-branch-1-v3.patch, HBASE-9465-branch-1-v4.patch, HBASE-9465-branch-1-v4.patch, HBASE-9465-v1.patch, HBASE-9465-v2.patch, HBASE-9465-v2.patch, HBASE-9465-v3.patch, HBASE-9465-v4.patch, HBASE-9465-v5.patch, HBASE-9465-v6.patch, HBASE-9465-v6.patch, HBASE-9465-v7.patch, HBASE-9465-v7.patch, HBASE-9465.pdf
>
>
> When a region move or RS failure occurs in the master cluster, the hlog entries that were not pushed before the region move or RS failure will be pushed by the original RS (for a region move) or by another RS that takes over the remaining hlog of the dead RS (for an RS failure), and the new entries for the same region(s) will be pushed by the RS that now serves the region(s), but they push the hlog entries of the same region concurrently without coordination.
> This treatment can lead to data inconsistency between the master and peer clusters:
> 1. a put and then a delete are written to the master cluster
> 2. due to region move / RS failure, they are pushed by different replication-source threads to the peer cluster
> 3. if the delete is pushed to the peer cluster before the put, and a flush and major compaction occur in the peer cluster before the put is pushed, the delete is collected and the put remains in the peer cluster
> In this scenario, the put remains in the peer cluster, but in the master cluster the put is masked by the delete, hence the data inconsistency between the master and peer clusters
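
The three-step scenario in the quoted issue can be illustrated with a toy model. This is only a sketch under stated assumptions: `SerialReplicationSketch`, `Edit`, `PeerCell` and `replay` are hypothetical names invented for the illustration and are not HBase classes, and the "major compaction" here is reduced to dropping delete markers together with the puts they mask at that moment.

```java
import java.util.ArrayList;
import java.util.List;

public class SerialReplicationSketch {

    /** One replicated edit for a single cell (hypothetical type, not an HBase class). */
    static final class Edit {
        final boolean isDelete;
        final long timestamp;
        final String value; // null for delete markers

        Edit(boolean isDelete, long timestamp, String value) {
            this.isDelete = isDelete;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Toy single-cell store standing in for the peer cluster. */
    static final class PeerCell {
        private final List<Edit> edits = new ArrayList<>();

        void apply(Edit e) {
            edits.add(e);
        }

        /** Timestamp of the newest delete marker still stored, or Long.MIN_VALUE. */
        private long newestDelete() {
            long ts = Long.MIN_VALUE;
            for (Edit e : edits) {
                if (e.isDelete) {
                    ts = Math.max(ts, e.timestamp);
                }
            }
            return ts;
        }

        /** Drops delete markers and every put they mask at compaction time. */
        void majorCompact() {
            final long cutoff = newestDelete();
            edits.removeIf(e -> e.isDelete || e.timestamp <= cutoff);
        }

        /** Returns the newest put not masked by a delete, or null if none is visible. */
        String read() {
            long cutoff = newestDelete();
            Edit newest = null;
            for (Edit e : edits) {
                if (!e.isDelete && e.timestamp > cutoff
                        && (newest == null || e.timestamp > newest.timestamp)) {
                    newest = e;
                }
            }
            return newest == null ? null : newest.value;
        }
    }

    /** Replays "put at t=1, delete at t=2" against the peer in the given arrival order. */
    public static String replay(boolean deleteArrivesFirst) {
        Edit put = new Edit(false, 1L, "v1");
        Edit delete = new Edit(true, 2L, null);
        PeerCell peer = new PeerCell();
        if (deleteArrivesFirst) {
            peer.apply(delete);
            peer.majorCompact(); // the delete marker is collected here
            peer.apply(put);     // the late put is no longer masked
        } else {
            peer.apply(put);
            peer.apply(delete);
            peer.majorCompact();
        }
        return peer.read();
    }

    public static void main(String[] args) {
        System.out.println("in order:     " + replay(false));
        System.out.println("out of order: " + replay(true));
    }
}
```

In this model the in-order replay leaves the cell empty, which matches the master cluster, while the out-of-order replay leaves the put visible on the peer. Pushing the entries of a region serially is meant to rule out the second ordering.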