hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Replication sink selection strategy
Date Tue, 12 Feb 2013 21:14:10 GMT
Hey Gabriel,

I think when I originally designed it I over-engineered it a bit. Just
picking a random one should be enough and make the code simpler.

J-D

On Tue, Feb 12, 2013 at 8:37 AM, Gabriel Reid <gabrielr@ngdata.com> wrote:
> Hi,
>
> I was wondering if someone (perhaps Jean-Daniel, but anyone is welcome)
> could explain the reasoning for the current peer sink selection logic
> within replication.
>
> As it currently stands, a percentage (by default 10%) of the slave
> cluster's region servers are randomly chosen by each region server in
> the master cluster as their replication pool. Each time a batch of edits
> is shipped to a peer, one region server is chosen from the pre-selected
> pool of slave region servers.
>
> I was wondering what the advantage(s) of this approach are compared to
> each master region server simply randomly choosing a slave peer from the
> full set of slave region servers. In my (probably naive) view, this
> approach would provide a more even distribution of usage over the whole
> slave cluster, and I can't see any real advantages that the current
> approach has (although I assume there must be some).
>
> Could someone let me know what the reasoning is behind the current approach?
>
> Thanks,
>
> Gabriel
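
For readers following the thread, the two strategies under discussion can be sketched roughly as below. This is an illustrative Python sketch only, not the HBase code: the function names, the `ratio` parameter, and the choice to round the pool size up with a minimum of one server are all assumptions made for the example. The first two functions model the current approach (pre-select a pool, then pick from it per batch); the third models the simpler alternative of picking from the full slave cluster every time.

```python
import math
import random


def choose_sink_pool(slave_servers, ratio=0.1, rng=random):
    """Current approach, step 1: pre-select a random subset of slave servers.

    The rounding rule (round up, at least one server) is an assumption
    made for this sketch.
    """
    servers = list(slave_servers)
    if not servers:
        return []
    pool_size = max(1, math.ceil(len(servers) * ratio))
    return rng.sample(servers, pool_size)


def pick_from_pool(pool, rng=random):
    """Current approach, step 2: pick one sink from the pool for each batch."""
    return rng.choice(pool)


def pick_from_all(slave_servers, rng=random):
    """Proposed alternative: pick one sink from the full slave cluster per batch."""
    return rng.choice(list(slave_servers))
```

With 100 slave region servers and the default 10%, a given master region server in this sketch only ever ships to the 10 servers in its pool, whereas `pick_from_all` can reach any of the 100 on every batch.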
