hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas
Date Fri, 17 Jan 2014 23:45:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875417#comment-13875417 ]

stack commented on HBASE-10070:

Ok on the timing.  You know how I feel about 1.0 -- sooner rather than later -- but hopefully
this feature gets done in time.

Looking at HBASE-10347, I have a 'design level' concern so let me raise it here rather than
there.  Let me repeat a comment I made there:

After thinking more on this, I 'get' why you have the replicas listed inside the row rather
than as rows themselves [in hbase:meta].  The row in hbase:meta becomes a proxy or facade
for the little cluster of regions, one of which is the primary with the others read replicas.
If that is the case, let's recognize it as such and make proper accommodation in the code base
and model.

Problems I see are:

+ HRegionInfo is now overloaded.  Before, it was the info on a specific region.  Now it is
trying to serve two purposes: its original intent, and also as a descriptor of the region-serving
'cluster' made of a primary and replicas.  Let's avoid overloading what up to now has had
a clear role in the hbase model.
+ The primary holds the 'pole position', its name being the name of the region in meta.  The read
replicas are named differently, with 00001, 00002, etc., interpolated into the middle of the
region name.  I suppose doing it this way 'minimizes' the disturbance in the code base, but
I'm worried this naming exception will only confuse, even though it minimizes change.  Why would
the primary not be named like the replica regions?
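
To make the naming question concrete, here is a minimal sketch of uniform naming, where the
primary carries index 0 in the same way replicas carry 1, 2, and so on.  Everything in it is
an assumption for illustration: RegionMember, group_members, and the "name_NNNNN" layout are
hypothetical and are not the format used in HBASE-10347 (which interpolates the index into
the middle of the region name) or any HBase API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionMember:
    """One member of the group of regions behind a single hbase:meta row (hypothetical)."""

    logical_name: str  # assumed: the hbase:meta row key that names the whole group
    index: int = 0     # every member gets an index; 0 when there are no replicas

    def physical_name(self) -> str:
        # Assumed layout for illustration only: logical name, underscore, 5-digit index.
        return f"{self.logical_name}_{self.index:05d}"


def group_members(logical_name: str, replica_count: int) -> list[RegionMember]:
    """Return the primary (index 0) followed by `replica_count` read replicas."""
    if replica_count < 0:
        raise ValueError("replica_count must be non-negative")
    return [RegionMember(logical_name, i) for i in range(replica_count + 1)]


if __name__ == "__main__":
    # The primary and the replicas are named by the same rule; only the index differs.
    for member in group_members("usertable,row-a,1389999999999", 2):
        print(member.physical_name())
```

With this shape a table without replicas still has a member with index 0, so the same naming
path is exercised whether or not the feature is turned on.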

On the latter I can hear a reply that goes, "Those who do not need read replicas
will be unaffected", to which I would counter that this ensures the feature will forever be
ghetto and no one will use it because it is unexercised.

Trying to ensure that we do not paint ourselves into a corner and to avoid the ghetto, looking
beyond read replicas to full-on quorum read/writes, I can imagine we'd need some means like
the above where the hbase:meta row name is no longer the physical name of a region but rather
a logical name.  With read replicas, the primary region is region number 00000,
but doing quorum read/writes, the leader will need to be able to change over the life of the

Going forward, all regions get an index?  By default the index is zero.  When there are replicas
or quorum members, the indices distinguish members.  With read replicas, the region with index
0 is primary.  With a quorum, the index has no special meaning.  In the past we have had two
naming conventions for regions live side by side in the one live cluster.  We could do it
> HBase read high-availability using eventually consistent region replicas
> ------------------------------------------------------------------------
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
> In the present HBase architecture, it is hard, probably impossible, to satisfy constraints
such as serving the 99th percentile of reads in under 10 ms. One of the major factors that
affects this is the MTTR for regions. There are three phases in the MTTR process - detection,
assignment, and recovery. Of these, detection is usually the longest and is presently
on the order of 20-30 seconds. During this time, clients are not able to read the
region data.
> However, some clients will be better served if regions are available for eventually
consistent reads during recovery. This will help satisfy low latency
guarantees for the class of applications that can work with stale reads.
> To improve read availability, we propose a replicated read-only region serving design,
also referred to as secondary regions, or region shadows. Extending the current model of a region
being opened for reads and writes in a single region server, the region will also be opened
for reading in other region servers. The region server which hosts the region for reads and writes
(as in the current case) will be declared as PRIMARY, while 0 or more region servers might be
hosting the region as SECONDARY. There may be more than one secondary (replica count >
> Will attach a design doc shortly which contains most of the details and some thoughts
about development approaches. Reviews are more than welcome.
> We also have a proof of concept patch, which includes the master and region server side
of the changes. Client side changes will be coming soon as well.
