Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 16 Dec 2014 10:41:16 +0000 (UTC)
From: "Liu Shaohui (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12759415.1417697471000.32183.1418726476577@Atlassian.JIRA>
In-Reply-To: <JIRA.12759415.1417697471000@Atlassian.JIRA>
References: <JIRA.12759415.1417697471000@Atlassian.JIRA>
 <JIRA.12759415.1417697471853@arcas>
Subject: [jira] [Commented] (HBASE-12636) Avoid too many write operations on
 zookeeper in replication
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248100#comment-14248100 ] 

Liu Shaohui commented on HBASE-12636:
-------------------------------------

[~lhofhansl] [~stack]
Duplicate replication data can already occur in several cases in the current codebase:
- A network problem prevents the master cluster from receiving the replication response, even though the replication data has already been written to the slave cluster.
- A region move in the slave cluster causes the replication call to fail after part of the replication data has already been written to the slave cluster.

This patch only makes duplicate replication data more frequent.

We may open another issue to check whether the replication operation is idempotent.
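
As a rough illustration of what such an idempotency check would look for, here is a toy model, not HBase code. It assumes that a replicated put carries its original timestamp and that the sink stores one value per (row, column, timestamp) coordinate. The class ReplaySink and its methods are hypothetical names used only for this sketch. Under those assumptions, replaying the same edit leaves the sink unchanged.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a sink table; hypothetical, not an HBase class. */
final class ReplaySink {
    // One value per (row, column, timestamp) coordinate.
    private final Map<String, String> cells = new TreeMap<>();

    /** Applies a put; writing the same coordinate again overwrites it with the same value. */
    void applyPut(String row, String column, long timestamp, String value) {
        cells.put(row + "/" + column + "/" + timestamp, value);
    }

    /** Returns a copy of the current contents so two states can be compared. */
    Map<String, String> snapshot() {
        return new TreeMap<>(cells);
    }
}
```

Operations whose result depends on the sink's current state would not fit this model and would need to be checked separately.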

> Avoid too many write operations on zookeeper in replication
> -----------------------------------------------------------
>
>                 Key: HBASE-12636
>                 URL: https://issues.apache.org/jira/browse/HBASE-12636
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.11
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>              Labels: replication
>             Fix For: 1.0.0
>
>         Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff
>
>
> In our production cluster, we found over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source writes the log position to zookeeper on every edit shipment. If the WAL currently being replicated is the same WAL the regionserver is writing to, each shipment is very small but the frequency is very high, which causes many write operations on zookeeper.
> A simple solution is to write the log position to zookeeper only when the position diff or the number of skipped edits exceeds a threshold, rather than on every edit shipment.
> Suggestions are welcomed, thx~
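
The threshold idea in the description above could be sketched roughly as follows. This is a minimal sketch and not the attached patch. ThrottledPositionWriter, PositionStore, and both threshold parameters are hypothetical names, and PositionStore merely stands in for whatever performs the zookeeper write.

```java
/** Hypothetical sketch: persist the WAL position only when a threshold is exceeded. */
final class ThrottledPositionWriter {
    /** Stand-in for the zookeeper write; not an HBase or ZooKeeper API. */
    interface PositionStore {
        void write(long position);
    }

    private final PositionStore store;
    private final long positionDiffThreshold; // bytes advanced since the last persisted position
    private final int editCountThreshold;     // edits shipped since the last persisted position
    private long lastWrittenPosition = 0L;
    private int editsSinceLastWrite = 0;

    ThrottledPositionWriter(PositionStore store, long positionDiffThreshold, int editCountThreshold) {
        this.store = store;
        this.positionDiffThreshold = positionDiffThreshold;
        this.editCountThreshold = editCountThreshold;
    }

    /** Called after each successful shipment; returns true if the position was persisted. */
    boolean onShipped(long newPosition, int editsShipped) {
        editsSinceLastWrite += editsShipped;
        boolean overPosition = newPosition - lastWrittenPosition > positionDiffThreshold;
        boolean overEdits = editsSinceLastWrite > editCountThreshold;
        if (overPosition || overEdits) {
            flush(newPosition);
            return true;
        }
        return false;
    }

    /** Persists the position unconditionally, e.g. when the source moves to another WAL. */
    void flush(long position) {
        store.write(position);
        lastWrittenPosition = position;
        editsSinceLastWrite = 0;
    }
}
```

The trade-off is the one raised in the comment: if the source fails between two persisted positions, edits shipped after the last persisted position are shipped again, so larger thresholds mean fewer zookeeper writes but more duplicate replication data after a failure.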


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)