Date: Tue, 16 Dec 2014 10:41:16 +0000 (UTC)
From: "Liu Shaohui (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication

    [ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248100#comment-14248100 ]

Liu Shaohui commented on HBASE-12636:
-------------------------------------

[~lhofhansl] [~stack]
Duplicate replication data can already occur in several cases in the current codebase:
- A network problem prevents the master cluster from receiving the replication response, even though the replication data has already been written to the slave cluster.
- A region move in the slave cluster makes the replication call fail after part of the replication data has already been written to the slave cluster.

This patch only makes duplicate replication data more frequent.
Should we open another issue to check whether the replication operation is idempotent?

> Avoid too many write operations on zookeeper in replication
> -----------------------------------------------------------
>
>                 Key: HBASE-12636
>                 URL: https://issues.apache.org/jira/browse/HBASE-12636
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.11
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>              Labels: replication
>             Fix For: 1.0.0
>
>         Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff
>
>
> In our production cluster, we found more than 1k write operations per second on ZooKeeper coming from HBase replication. The reason is that the replication source writes the log position to ZooKeeper after every edit shipment. If the WAL currently being replicated is the same WAL the regionserver is writing to, each shipment is very small but the shipments are very frequent, which causes many write operations on ZooKeeper.
> A simple solution is to write the log position to ZooKeeper only when the position diff or the number of edits accumulated since the last write exceeds a threshold, rather than after every edit shipment.
> Suggestions are welcome, thx~
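
For illustration only, the threshold idea in the description could be sketched roughly as below. This is not the attached patch: PositionStore, ThrottledPositionWriter, onShipped and both threshold parameters are hypothetical names, and the lambda stands in for the real ZooKeeper write. Forcing a write when the WAL changes is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sink for log positions; stands in for the ZooKeeper write. */
interface PositionStore {
    void writePosition(String wal, long position);
}

/** Persists a WAL position only once enough bytes or edits have accumulated. */
class ThrottledPositionWriter {
    private final PositionStore store;
    private final long positionDiffThreshold; // bytes, assumed tuning knob
    private final int editCountThreshold;     // edits, assumed tuning knob

    private String lastWal = null;
    private long lastWrittenPosition = 0L;
    private int editsSinceLastWrite = 0;

    ThrottledPositionWriter(PositionStore store, long positionDiffThreshold, int editCountThreshold) {
        this.store = store;
        this.positionDiffThreshold = positionDiffThreshold;
        this.editCountThreshold = editCountThreshold;
    }

    /** Call after each shipment; returns true if the position was persisted. */
    boolean onShipped(String wal, long position, int shippedEdits) {
        editsSinceLastWrite += shippedEdits;
        // Assumption: a new WAL always forces a write, since positions restart per file.
        boolean walChanged = !wal.equals(lastWal);
        boolean movedFarEnough = position - lastWrittenPosition >= positionDiffThreshold;
        boolean enoughEdits = editsSinceLastWrite >= editCountThreshold;
        if (walChanged || movedFarEnough || enoughEdits) {
            store.writePosition(wal, position);
            lastWal = wal;
            lastWrittenPosition = position;
            editsSinceLastWrite = 0;
            return true;
        }
        return false;
    }
}

class ThrottleDemo {
    public static void main(String[] args) {
        List<Long> writes = new ArrayList<>();
        ThrottledPositionWriter writer =
            new ThrottledPositionWriter((wal, pos) -> writes.add(pos), 1000L, 10);
        writer.onShipped("wal-1", 100L, 1);   // first shipment on this WAL: written
        writer.onShipped("wal-1", 200L, 1);   // below both thresholds: not written
        writer.onShipped("wal-1", 1100L, 1);  // position diff reached: written
        System.out.println("positions written: " + writes);
    }
}
```

The trade-off is the one raised in the comment above: any edits shipped after the last persisted position would be shipped again after a failover, so larger thresholds mean fewer ZooKeeper writes but more duplicate replication data.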