hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas
Date Wed, 15 Jan 2014 23:54:27 GMT

    https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872830#comment-13872830

Enis Soztutar commented on HBASE-10070:

bq. Has anyone asked for this feature on the list or in issues? You have a hard user for this
feature or is this speculative work? If so, is the user/customer looking explicitly to be
able to do stale reads or is giving them stale data a compromise on what they are actually
asking for (Let me guess, HA consistent view). If HA consistent view is what they want, should
we work on that instead?
bq. Should such clients use another data store, one that allows eventually consistent views?
Clients having to deal with sometimes stale data will be more involved
We have a customer use case which is the main driver, but while still in the design stage we
are also seeing interest from other prospects. The main use case is actually not an HA consistent
view. "Eventual consistency" is a bit of a misnomer here; the main use case is being able to
read at least some data back in case of a failover. It is closer to a "timeline consistency"
than to eventual consistency a-la Dynamo. The data might be stale, as long as that is acknowledged
as such, but for serving the data to the web tier there should be a better guarantee than our
current MTTR story. 
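To make the timeline-consistency semantics concrete, here is a minimal self-contained sketch (hypothetical types and names, not the actual HBase client API): writes go to the primary, secondaries catch up asynchronously, and a read falls back to a possibly stale secondary only when the primary is down, flagging the result as stale.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of timeline-consistency read semantics (hypothetical types, not
// the actual HBase API): the primary serves the latest value; if it is
// unreachable, a secondary serves a possibly stale value and the result
// is flagged so the caller knows it may be behind.
public class TimelineReadSketch {

    static class ReadResult {
        final String value;
        final boolean stale;   // true when served by a secondary replica
        ReadResult(String value, boolean stale) {
            this.value = value;
            this.stale = stale;
        }
    }

    static class Replica {
        boolean up = true;
        String value;          // last value this replica has seen
    }

    private final Replica primary = new Replica();      // replicaId = 0
    private final List<Replica> secondaries = new ArrayList<>();

    TimelineReadSketch(int numSecondaries) {
        for (int i = 0; i < numSecondaries; i++) {
            secondaries.add(new Replica());             // replicaId = 1, 2, ...
        }
    }

    // Writes go only to the primary; secondaries lag behind it.
    void write(String value) {
        primary.value = value;
    }

    // Asynchronous replication, modeled here as an explicit step.
    void replicate() {
        for (Replica s : secondaries) {
            s.value = primary.value;
        }
    }

    void killPrimary() {
        primary.up = false;
    }

    // Timeline read: primary first, then fall back to any live secondary.
    ReadResult read() {
        if (primary.up) {
            return new ReadResult(primary.value, false);
        }
        for (Replica s : secondaries) {
            if (s.up) {
                return new ReadResult(s.value, true);  // possibly stale
            }
        }
        throw new IllegalStateException("no replica available");
    }
}
```

The point of the sketch is the contract, not the mechanism: a failed-over read returns an older-but-acknowledged value rather than an error, which is exactly the "read at least some data back" guarantee above.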
bq. I am concerned that a prime directive, consistent view, is being softened. As is, its
easy saying what we are. Going forward, lets not get to a spot where we have to answer "It
is complicated..." when asked if we are a consistent store or not.
Again, the main semantics are not changed. As per the tradeoffs section, we are trying to add
the flexibility for some tradeoffs. Our whole CAP story is not a black-or-white choice: we
are strongly consistent for single-row updates, highly available across regions (an RS going
down does not affect regions not hosted there), and eventually consistent across DCs.
bq. Could we implement this feature with some minor changes in core and then stuff like clients
that can do the read replica dance done as subclass of current client – a read replica client
– or at a layer above current client?
bq. Seems wrong having RPC conscious of replicas. I'd think this managed at higher levels up
in HCM.
[~nkeywal] do you want to chime in? 
bq.  Why notions of secondary and tertiary? Isn't it primary and replica only?
I should revisit the wording, I guess. We are using the terminology: primary <=> replicaId
= 0, secondaries <=> replicaId > 0, with secondary <=> replicaId = 1, tertiary
<=> replicaId = 2, etc. 
bq. "Two region replicas cannot be hosted at the same RS (hard)" If RS count is < # of
replicas, this is relaxed I'm sure (hard becomes soft). Hmm... this seems complicated: 
Agreed. I'll update the doc to reflect that. Initially, I thought we should do an underReplicated-regions
kind of thing, but that requires more intrusive changes to the AM / LB since we would have
to keep these around, etc. Now the LB design is simpler: we just try not to co-host region
replicas, but if we cannot (in case replication > # RS or # racks, etc.) we simply assign
the regions anyway. 
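A rough sketch of that relaxed placement rule (illustrative only, not the actual LoadBalancer code): prefer a server with no replica of the region yet, and fall back to co-hosting on the least-loaded server when replication > # RS.

```java
// Sketch of the relaxed co-hosting rule: prefer a region server that does
// not already host a replica of this region; when there are more replicas
// than servers, fall back to the least-loaded server rather than leaving
// replicas unassigned.
public class ReplicaPlacementSketch {

    // Returns the chosen server index for each replica of one region.
    public static int[] place(int numReplicas, int numServers) {
        int[] assignment = new int[numReplicas];
        int[] replicasOnServer = new int[numServers];
        for (int r = 0; r < numReplicas; r++) {
            int chosen = -1;
            // Hard-ish constraint: a server with no replica of this region.
            for (int s = 0; s < numServers; s++) {
                if (replicasOnServer[s] == 0) {
                    chosen = s;
                    break;
                }
            }
            // Relaxed fallback: co-host on the least-loaded server.
            if (chosen == -1) {
                chosen = 0;
                for (int s = 1; s < numServers; s++) {
                    if (replicasOnServer[s] < replicasOnServer[chosen]) {
                        chosen = s;
                    }
                }
            }
            assignment[r] = chosen;
            replicasOnServer[chosen]++;
        }
        return assignment;
    }
}
```

With 3 replicas and 3 servers every replica lands on a distinct server; with 3 replicas and 2 servers the constraint degrades gracefully and all replicas are still assigned.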
bq. This design looks viable to me and the document is of good quality.
Thanks :) 

> HBase read high-availability using eventually consistent region replicas
> ------------------------------------------------------------------------
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
> In the present HBase architecture, it is hard, probably impossible, to satisfy constraints
like "the 99th percentile of reads will be served under 10 ms". One of the major factors affecting
this is the MTTR for regions. There are three phases in the MTTR process - detection, assignment,
and recovery. Of these, detection is usually the longest and is presently on the order of
20-30 seconds. During this time, clients are not able to read the region data.
> However, some clients will be better served if regions are available for eventually consistent
reads during recovery. This will help satisfy low-latency guarantees for the class of applications
that can work with stale reads.
> For improving read availability, we propose a replicated read-only region serving design,
also referred to as secondary regions, or region shadows. Extending the current model of a
region being opened for reads and writes in a single region server, the region will also be
opened for reading in other region servers. The region server which hosts the region for reads
and writes (as in the current case) will be declared PRIMARY, while 0 or more region servers
might be hosting the region as SECONDARY. There may be more than one secondary (replica count >
2).
> Will attach a design doc shortly which contains most of the details and some thoughts
about development approaches. Reviews are more than welcome. 
> We also have a proof of concept patch, which includes the master and regions server side
of changes. Client side changes will be coming soon as well. 

This message was sent by Atlassian JIRA
