hbase-issues mailing list archives

From "Konstantin Boudnik (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10070) HBase read high-availability using timeline-consistent region replicas
Date Wed, 21 May 2014 01:19:43 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004184#comment-14004184 ]

Konstantin Boudnik commented on HBASE-10070:

Can I ask what definition of *timeline consistency* we are using here? I am a little uncomfortable
with it, as I don't know exactly what this term implies.

To elaborate: in the case of consensus-based replication (described in the design
document attached to HBASE-10909), we claim that all writable active replicas are *one-copy
equivalent*, that is, strongly consistent across replicas that have reached the same GSN. In
this JIRA, *strong consistency* with just a single writable replica (and no RO slaves)
has the same semantics. I believe that by providing a pluggable fail-over policy implementation
we will guarantee that *strong consistency* means the same thing under consensus-based
replication as it does in HBASE-10070. In other words, we'll provide an implementation
of the semantics rather than just documentation of them.

Relating this to an earlier comment from Stack:
bq. Pardon all the questions. I am concerned that a prime directive, consistent view, is being
softened. As is, its easy saying what we are. Going forward, lets not get to a spot where
we have to answer "It is complicated..." when asked if we are a consistent store or not

Shall we try to provide harder consistency guarantees, while covering the weaker ones en route?

> HBase read high-availability using timeline-consistent region replicas
> ----------------------------------------------------------------------
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
> In the present HBase architecture, it is hard, probably impossible, to satisfy constraints
such as serving the 99th percentile of reads in under 10 ms. One of the major factors that
affects this is the MTTR for regions. There are three phases in the MTTR process: detection,
assignment, and recovery. Of these, detection is usually the longest and is presently
on the order of 20-30 seconds. During this time, clients are not able to read the
region data.
> However, some clients will be better served if regions remain available during
recovery for eventually consistent reads. This will help satisfy low-latency
guarantees for classes of applications that can work with stale reads.
> To improve read availability, we propose a replicated read-only region serving design,
also referred to as secondary regions or region shadows. Extending the current model of a region
being opened for reads and writes in a single region server, the region will also be opened
for reading in other region servers. The region server which hosts the region for reads and writes
(as in current case) will be declared as PRIMARY, while 0 or more region servers might be
hosting the region as SECONDARY. There may be more than one secondary (replica count >
> Will attach a design doc shortly which contains most of the details and some thoughts
about development approaches. Reviews are more than welcome. 
> We also have a proof-of-concept patch, which includes the master and region server side
changes. Client side changes will be coming soon as well. 
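
For readers trying to picture the PRIMARY/SECONDARY split described in the quoted proposal, here is a minimal toy model in Python. It is only a sketch under stated assumptions: it is not HBase code and not the attached patch, and the names `Region`, `Replica`, `replicate`, and `allow_stale` are hypothetical. It assumes that secondaries apply the primary's writes in the primary's order, possibly lagging behind, and that a reader must opt in to possibly stale answers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Replica:
    """One copy of a region (hypothetical model, not an HBase class)."""
    role: str  # "PRIMARY" or "SECONDARY"
    data: Dict[str, str] = field(default_factory=dict)
    applied_seq: int = 0  # sequence number of the last write applied


class Region:
    """Toy region: one writable primary, zero or more read-only secondaries."""

    def __init__(self, secondaries: int) -> None:
        self.primary = Replica("PRIMARY")
        self.secondaries = [Replica("SECONDARY") for _ in range(secondaries)]
        self.log: List[Tuple[int, str, str]] = []  # (seq, key, value)
        self.primary_up = True

    def put(self, key: str, value: str) -> None:
        # Only the primary accepts writes.
        if not self.primary_up:
            raise RuntimeError("primary unavailable: writes are blocked")
        seq = self.primary.applied_seq + 1
        self.log.append((seq, key, value))
        self.primary.data[key] = value
        self.primary.applied_seq = seq

    def replicate(self, replica: Replica) -> None:
        # Apply pending writes to a secondary in the primary's order.
        for seq, key, value in self.log:
            if seq > replica.applied_seq:
                replica.data[key] = value
                replica.applied_seq = seq

    def get(self, key: str, allow_stale: bool = False) -> Tuple[Optional[str], bool]:
        """Return (value, is_stale); is_stale means a secondary answered."""
        if self.primary_up:
            return self.primary.data.get(key), False
        if not allow_stale or not self.secondaries:
            raise RuntimeError("primary unavailable and no stale read possible")
        # Assumption: serve from the most caught-up secondary.
        best = max(self.secondaries, key=lambda r: r.applied_seq)
        return best.data.get(key), True


region = Region(secondaries=1)
region.put("row1", "v1")
region.replicate(region.secondaries[0])
region.put("row1", "v2")        # not yet replicated to the secondary
region.primary_up = False       # simulate the detection window
value, stale = region.get("row1", allow_stale=True)
print(value, stale)
```

In this model a reader that does not set `allow_stale` gets an error while the primary is down, which corresponds to today's behaviour during the detection window; a reader that does set it gets an older value together with a flag saying so.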

This message was sent by Atlassian JIRA
