hbase-issues mailing list archives

From "Feng Honghua (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9501) Provide throttling for replication
Date Fri, 07 Feb 2014 08:47:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894317#comment-13894317 ]

Feng Honghua commented on HBASE-9501:
-------------------------------------

bq.How do you feel about using a class instead of a static method? IIUC you could push the
two AtomicLongs in there, move the unit test out of TestReplicationSmallTests (so that you
don't have to pay the price for setUp()) and have meaningful method names instead of lines
like this one
==> done

bq.The InterruptedException should be caught, if something told ReplicationSource to stop
then we shouldn't ignore it and try to continue shipping edits
==> How about logging it and then interrupting the current thread? ReplicationSource.sleepForRetries()
has the same problem. There also seem to be several ways of handling a sleep's InterruptedException
in HBase: some callers ignore it, and some don't catch it and leave it to their callers...
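
The "log and interrupt the current thread" option could look roughly like the sketch below. It is only an illustration, not the code in any of the attached patches: the class SleepSketch and the method sleepUnlessInterrupted are hypothetical names, and System.err stands in for a real logger.

```java
// Hypothetical helper, not part of HBase or of the attached patches.
final class SleepSketch {

    /**
     * Sleeps for the given number of milliseconds.
     *
     * @return true if the sleep completed, false if the thread was interrupted
     */
    static boolean sleepUnlessInterrupted(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Log instead of swallowing silently (System.err stands in for a logger).
            System.err.println("Interrupted while sleeping: " + e);
            // Restore the interrupt flag so callers can still see the stop request.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate a stop request
        boolean completed = sleepUnlessInterrupted(1000);
        System.out.println("completed=" + completed
            + ", interrupted=" + Thread.currentThread().isInterrupted());
    }
}
```

With this shape, a shipping loop can check the return value (or the thread's interrupt flag) and stop shipping edits instead of continuing after a stop request.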

> Provide throttling for replication
> ----------------------------------
>
>                 Key: HBASE-9501
>                 URL: https://issues.apache.org/jira/browse/HBASE-9501
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>            Reporter: Feng Honghua
>            Assignee: Feng Honghua
>         Attachments: HBASE-9501-trunk_v0.patch, HBASE-9501-trunk_v1.patch, HBASE-9501-trunk_v2.patch,
HBASE-9501-trunk_v3.patch
>
>
> When we disable a peer for a period of time and then enable it, the ReplicationSource
in the master cluster pushes the hlog entries accumulated during the disabled interval to the
re-enabled peer cluster at full speed.
> If the bandwidth between the two clusters is shared by different applications, a push at
full speed for replication can use all of the bandwidth and severely affect the other applications.
> There are two configs, replication.source.size.capacity and replication.source.nb.capacity,
for tuning the batch size each push delivers, but decreasing them only increases the
number of pushes, and all these pushes proceed continuously without pause, so they offer no
obvious help for bandwidth throttling.
> From a bandwidth-sharing and push-speed perspective, it's more reasonable to provide a
bandwidth upper limit for each peer push channel; within that limit, a peer can choose a big
batch size for each push for bandwidth efficiency.
> Any opinions?
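
One way to picture the per-peer bandwidth limit proposed in the description is a fixed one-second window with a byte budget. The sketch below is an assumption for illustration only and is not taken from the attached patches: the class BandwidthThrottlerSketch, its method names, and the one-second window are all hypothetical. The caller passes in the current time so the logic can be exercised without real sleeping.

```java
// Hypothetical sketch of a per-peer bandwidth cap; not the code in the patches.
final class BandwidthThrottlerSketch {
    private static final long WINDOW_MILLIS = 1000;

    private final long bytesPerWindow; // byte budget per one-second window
    private long windowStartMillis;    // start time of the current window
    private long bytesInWindow;        // bytes already pushed in this window

    BandwidthThrottlerSketch(long bytesPerSecond, long nowMillis) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerWindow = bytesPerSecond;
        this.windowStartMillis = nowMillis;
    }

    /** Returns how long to wait, in milliseconds, before pushing the batch. */
    long delayBeforePush(long batchBytes, long nowMillis) {
        rollWindowIfExpired(nowMillis);
        // The first batch in a window always goes through, even if it is larger
        // than the budget; otherwise an oversized batch could never be sent.
        if (bytesInWindow == 0 || bytesInWindow + batchBytes <= bytesPerWindow) {
            return 0;
        }
        // Budget exhausted: wait until the current window ends.
        return windowStartMillis + WINDOW_MILLIS - nowMillis;
    }

    /** Records that a batch of the given size was pushed at the given time. */
    void recordPush(long batchBytes, long nowMillis) {
        rollWindowIfExpired(nowMillis);
        bytesInWindow += batchBytes;
    }

    private void rollWindowIfExpired(long nowMillis) {
        if (nowMillis - windowStartMillis >= WINDOW_MILLIS) {
            windowStartMillis = nowMillis;
            bytesInWindow = 0;
        }
    }
}
```

A push loop would call delayBeforePush() before each batch, sleep for the returned time if it is non-zero, ship the batch, and then call recordPush(). Because the limit is on bytes per window rather than on batch size, a peer can still use a large batch for each push.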



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)