Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 29 Jan 2014 17:40:10 +0000 (UTC)
From: "Jean-Daniel Cryans (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12668014.1378893547743.4122.1391017210912@arcas>
In-Reply-To: <JIRA.12668014.1378893547743@arcas>
References: <JIRA.12668014.1378893547743@arcas>
Subject: [jira] [Commented] (HBASE-9501) Provide throttling for replication
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885575#comment-13885575 ] 

Jean-Daniel Cryans commented on HBASE-9501:
-------------------------------------------

I think the patch should be refactored so that the bandwidth limiter is just a tool: you give it some information and it tells you how long you should sleep. That way the unit tests don't have to track how much time was actually slept, which tends to be very unreliable, especially on the build machines.

It will also make the code more readable in ReplicationSource and won't add a lot of runtime to TestReplicationSmallTests.
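
To illustrate the idea, here is a minimal sketch of a limiter that only computes a sleep time and never sleeps itself. The class name, method names, and the fixed one-second window are assumptions made for illustration; none of them come from the attached patches. The caller passes in the current time, so a test can use made-up timestamps instead of measuring real sleeps.

```java
/**
 * Hypothetical sketch: a per-peer bandwidth limiter that reports how long
 * the caller should sleep, without sleeping or reading the clock itself.
 */
class BandwidthLimiterSketch {
    // Length of one accounting window (an assumption of this sketch).
    private static final long WINDOW_MS = 1000L;

    private final long bytesPerWindow; // <= 0 means throttling is disabled
    private long windowStartMs;
    private long bytesInWindow;

    BandwidthLimiterSketch(long bytesPerWindow, long nowMs) {
        this.bytesPerWindow = bytesPerWindow;
        this.windowStartMs = nowMs;
        this.bytesInWindow = 0L;
    }

    /**
     * Returns the number of milliseconds the caller should sleep before
     * pushing a batch of the given size. Does not change any state.
     */
    long sleepTimeMs(long batchSizeBytes, long nowMs) {
        if (bytesPerWindow <= 0) {
            return 0L; // throttling disabled
        }
        long elapsed = nowMs - windowStartMs;
        if (elapsed >= WINDOW_MS) {
            return 0L; // the window has expired, so the budget is fresh
        }
        if (bytesInWindow + batchSizeBytes <= bytesPerWindow) {
            return 0L; // the batch fits in the remaining budget
        }
        // Over budget: wait until the current window ends. A batch larger
        // than the whole budget is therefore delayed once, then allowed.
        return WINDOW_MS - elapsed;
    }

    /** Records a push that was actually sent, starting a new window if needed. */
    void recordPush(long batchSizeBytes, long nowMs) {
        if (nowMs - windowStartMs >= WINDOW_MS) {
            windowStartMs = nowMs;
            bytesInWindow = 0L;
        }
        bytesInWindow += batchSizeBytes;
    }
}
```

With something along these lines, ReplicationSource would ask for the sleep time before each push, sleep that long if it is non-zero, and record the push afterwards. A unit test would only need to check the returned values for a few chosen timestamps.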

> Provide throttling for replication
> ----------------------------------
>
>                 Key: HBASE-9501
>                 URL: https://issues.apache.org/jira/browse/HBASE-9501
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>            Reporter: Feng Honghua
>            Assignee: Feng Honghua
>         Attachments: HBASE-9501-trunk_v0.patch, HBASE-9501-trunk_v1.patch
>
>
> When we disable a peer for a period of time and then enable it, the ReplicationSource in the master cluster pushes the hlog entries accumulated during the disabled interval to the re-enabled peer cluster at full speed.
> If the bandwidth between the two clusters is shared by different applications, a full-speed replication push can use all the bandwidth and severely affect other applications.
> There are two configs, replication.source.size.capacity and replication.source.nb.capacity, for tweaking the batch size each push delivers, but if we decrease them the number of pushes increases, and all these pushes proceed continuously without pause, so this does not noticeably help with bandwidth throttling.
> From a bandwidth-sharing and push-speed perspective, it is more reasonable to provide a bandwidth upper limit for each peer push channel; within that limit, a peer can choose a large batch size for each push for bandwidth efficiency.
> Any opinion?

