hbase-user mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: HBase Replication vs Read Replicas
Date Wed, 11 Oct 2017 04:53:02 GMT
Hi Kahlil

Your understanding is right: HBase replication works across data centres,
whereas HBase read replicas are more about providing high availability for
reads.

>> not be the proper tool to use here since it appears to have higher
>> replication latency and be more catered towards Disaster Recovery than High
>> Availability.

Yes, you are right here.

As for read replicas, however, I am not sure whether they can span data
centres. You can host your region servers on different racks, and the region
replicas can be placed so that the replicas of a region sit on different
racks. Then, even if one rack is down, your data can still be served from
the replica regions on the other racks.


>> My use case is that I have a table I'd like to replicate between data
>> centers A and B. It is OK if all writes can only go through one data center
>> (say, A). However, all clients should be able to read from either A or B.
>> In particular, I'd like for some clients to be able to specifically say
>> they'd like to read from A and others to say they'd like to read from B,
>> for any given row key.

To answer this: a long time ago we developed a feature called Cross Site Big
Table, which lets you configure two data centres and lets your client write
to them. As you wanted, reads can be targeted specifically at data centre A
or B by naming it in the client API. There will be some lag, because
replication still has to happen, but it lets you manage your writes and
reads across clusters through a single client.

https://www.slideshare.net/HBaseCon/ecosystem-session-3.
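
As a rough illustration of that pattern only, here is a minimal in-memory
sketch in Java. It is not the Cross Site Big Table API or any HBase API:
the class, enum and method names are all hypothetical, and the replicate()
call merely stands in for asynchronous replication between the two sites.
It assumes all writes go through site A, and that reads from site B are
always flagged as possibly stale because the client cannot see how far
replication has caught up.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical in-memory model of the pattern described above. None of these
// names come from HBase or from the Cross Site Big Table feature.
final class CrossSiteSketch {
    enum Site { A, B }

    // A read result tagged with the serving site and a staleness hint.
    static final class ReadResult {
        final String value;          // null if the row is absent at that site
        final Site servedBy;
        final boolean possiblyStale;

        ReadResult(String value, Site servedBy, boolean possiblyStale) {
            this.value = value;
            this.servedBy = servedBy;
            this.possiblyStale = possiblyStale;
        }
    }

    private final Map<String, String> siteA = new HashMap<>();  // accepts all writes
    private final Map<String, String> siteB = new HashMap<>();  // fed by replication
    private final Queue<String[]> pending = new ArrayDeque<>(); // edits not yet shipped to B

    // All writes go through site A and are queued for later shipping to B.
    void put(String row, String value) {
        siteA.put(row, value);
        pending.add(new String[] {row, value});
    }

    // Stand-in for the replication stream: apply queued edits to site B in order.
    void replicate() {
        String[] edit;
        while ((edit = pending.poll()) != null) {
            siteB.put(edit[0], edit[1]);
        }
    }

    // The caller names the site to read from. Reads served by B are always
    // flagged as possibly stale, since the backlog is invisible to the client.
    ReadResult get(String row, Site site) {
        Map<String, String> store = (site == Site.A) ? siteA : siteB;
        return new ReadResult(store.get(row), site, site == Site.B);
    }

    public static void main(String[] args) {
        CrossSiteSketch table = new CrossSiteSketch();
        table.put("row1", "v1");
        // Before replication runs, site B has not seen the row yet.
        System.out.println("B before: " + table.get("row1", Site.B).value);
        table.replicate();
        ReadResult r = table.get("row1", Site.B);
        System.out.println("B after: " + r.value + " possiblyStale=" + r.possiblyStale);
    }
}
```

A client that must read from a particular data centre passes Site.A or
Site.B explicitly and checks possiblyStale on the result. Enforcing a bound
such as 60 seconds would additionally need some measure of replication lag,
which this sketch does not model.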

Regards
Ram


On Tue, Oct 10, 2017 at 8:25 PM, Kahlil Oppenheimer <
kahliloppenheimer@gmail.com> wrote:

> Hi All,
>
> I have some questions about when to use HBase Replication vs. HBase Read
> Replicas. They seem to accomplish similar-ish things, and I'm trying to
> figure out which I should use.
>
> I've read through the documentation, but I am confused on a few points. It
> seems that HBase Replication can have very high latency for replication (on
> the order of minutes). My application can tolerate a rough maximum of 60s
> of replication latency, so that would be problematic for me.
>
> Read Replicas seem to have quite low (configurable) replication latency,
> but do not seem to lend themselves to cross-datacenter replication. For
> instance, having Replica 1 in Datacenter A and Replica 2 in Datacenter B,
> allowing clients to say "Read only from Datacenter A" vs. "Read only from
> Datacenter B".
>
> My use case is that I have a table I'd like to replicate between data
> centers A and B. It is OK if all writes can only go through one data center
> (say, A). However, all clients should be able to read from either A or B.
> In particular, I'd like for some clients to be able to specifically say
> they'd like to read from A and others to say they'd like to read from B,
> for any given row key.
>
> It is also OK if the data coming from one of these reads can be stale, so
> long as it is no more than 60s stale, and that the client has some
> indication that the data may not be up to date.
>
> Because of the 60s stale constraint, it seems like HBase Replication may
> not be the proper tool to use here since it appears to have higher
> replication latency and be more catered towards Disaster Recovery than High
> Availability.
>
> Read Replicas seem like the proper solution here, but the Timeline
> consistency model doesn't seem to let me say "Read from datacenter B", it
> just says "Try to read from all data-centers and return B if it gets back
> first". Furthermore, it doesn't seem intuitive to force the region replicas
> to be hosted on datacenter B.
>
> What would you all recommend? Am I misunderstanding either of these HBase
> features, or is there a more intuitive feature of HBase I should reference
> to solve this problem?
>
> For what it's worth, I'm running the CDH-5.9-1.2.0 version of HBase.
>
> Many thanks,
> Kahlil
>
