hbase-dev mailing list archives

From Devaraj Das <d...@hortonworks.com>
Subject Re: [PROPOSAL] HBASE-10070 branch
Date Wed, 15 Jan 2014 20:51:06 GMT
Some responses inline. Thanks for the inputs.

On Wed, Jan 15, 2014 at 11:17 AM, Stack <stack@duboce.net> wrote:
> On Wed, Jan 15, 2014 at 12:44 AM, Enis Söztutar <enis@hortonworks.com> wrote:
>> Hi,
>> I just wanted to give some updates on the HBASE-10070 efforts from the
>> technical side, and development side, and propose a branch.
>> From the technical side:
>> The changes for region replicas phase 1 are becoming more mature and
>> stable, and most of the "base" changes are starting to become good
>> candidates for review. The code has been rebased to trunk, and the main
>> working repo has been moved to the HBASE-10070 branch at
>> https://github.com/enis/hbase/tree/hbase-10070.
>> An overview of the changes that are working includes:
>>  - HRegionInfo & MetaReader & MetaEditor changes to support region
>> replicas
>>  - HTableDescriptor changes and shell changes for supporting replicas
>>  - WebUI changes to display whether a region is a replica or not
>>  - AssignmentManager changes coupled with RegionStates & Master changes to
>> create and assign replicas, and to support alter table, enable table, etc.
> Thanks for the writeup.
> I am late to the game so take my comments w/ a grain of salt -- I'll take a
> look at HBASE-10070 -- but high-level do we have to go the read replicas
> route?  IMO, having our current already-strained AssignmentManager code
> base manage three replicas instead of one will ensure that Jimmy Xiang and
> Jeffrey Zhong do nothing else for the next year or two but work on the new
> interesting use cases introduced by this new level of complexity put upon a
> system that has just achieved a hard-won stability.

Stack, the model is that the replicas (HRegionInfo with an added field
'replicaId') are treated just as any other region in the AM. You can
see the code - it's not adding much at all in terms of new code to
handle replicas.
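To make the model above concrete, here is a minimal sketch of the replica identity idea: each replica is just a region description with an extra replicaId field, where replicaId 0 is the primary, so the AM can place and assign each one like any other region. The class and record names here are illustrative stand-ins, not the actual HBase types.

```java
// Sketch of the "HRegionInfo with an added 'replicaId'" model. Names are
// hypothetical; only the shape of the idea matches the proposal above.
public class ReplicaModelSketch {
    static final int PRIMARY_REPLICA_ID = 0;

    // Stand-in for HRegionInfo: same table and key range, differing only
    // in the replicaId field.
    record RegionId(String table, String startKey, int replicaId) {
        boolean isPrimary() { return replicaId == PRIMARY_REPLICA_ID; }
    }

    public static void main(String[] args) {
        // Three replicas of the same key range; the assignment manager
        // treats each as an independent region to assign and track.
        RegionId primary  = new RegionId("t1", "row-a", 0);
        RegionId replica1 = new RegionId("t1", "row-a", 1);
        RegionId replica2 = new RegionId("t1", "row-a", 2);
        System.out.println(primary.isPrimary());   // true
        System.out.println(replica1.isPrimary());  // false
        System.out.println(replica2.replicaId());  // 2
    }
}
```

The point of the sketch is that no new state machine is needed: the replica is distinguishable only by one extra field.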

> A few of us chatting offline -- Jimmy, Jon, Elliott, and I -- were
> wondering if you couldn't solve this read replicas in a more hbase 'native'
> way* by just bringing up three tables -- a main table and then two snapshot
> clones with the clones refreshed on a period (via snapshot or via
> in-cluster replication) --  and then a shim on top of an HBase client would
> read from the main table until failure and then from a snapshot until the
> main came back.  Reads from snapshot tables could be marked 'stale'.  You'd
> have to modify the balancer so the tables -- or at least their regions --
> were physically distinct... you might be able to just have the three tables
> each in a different namespace.

At a high level, considering all the work that would be needed in the
client (for it to be able to be aware of the primary and the snapshot
regions) and in the master (to do with managing the placements of the
regions), I am not convinced. Also, consider that you would be taking a
lot of snapshots, adding to the filesystem's load.

> Or how much more work would it take to follow the route our Facebook
> brothers and sisters have taken doing quorum reads and writes in-cluster?

If you're talking about Facebook's work described in HBASE-7509, quorum
reads are something that we will benefit from, and that will help the
filesystem side of the story, but we still need multiple (redundant)
regions on the hbase side. If a region is not reachable, the client
could go to another replica for the region...
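The fallback behavior described above could be sketched roughly as follows: try the primary replica first, and if it is not reachable, read from a secondary and flag the result as stale. This is generic illustration code, not the real HBase client; the `Result` record and the list-of-readers stand-in for per-replica RPC targets are assumptions for the example.

```java
import java.util.List;
import java.util.function.Function;

// Rough sketch of a failover read: index 0 in 'readers' is the primary
// replica; later indexes are secondaries that may serve stale data.
public class FailoverReadSketch {
    record Result(String value, boolean stale) {}

    static Result get(String row, List<Function<String, String>> readers) {
        for (int replicaId = 0; replicaId < readers.size(); replicaId++) {
            try {
                String v = readers.get(replicaId).apply(row);
                // Only the primary can serve a non-stale read.
                return new Result(v, replicaId != 0);
            } catch (RuntimeException unreachable) {
                // This replica is not reachable; try the next one.
            }
        }
        throw new RuntimeException("all replicas failed for row " + row);
    }

    public static void main(String[] args) {
        Function<String, String> deadPrimary =
                row -> { throw new RuntimeException("region not reachable"); };
        Function<String, String> secondary = row -> "v1";
        Result r = get("row-a", List.of(deadPrimary, secondary));
        System.out.println(r.value() + " stale=" + r.stale()); // v1 stale=true
    }
}
```

The stale flag is what keeps the default API contract intact: callers who never opt into replica reads never see a stale result.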

> * When I say 'native' way in the above, what I mean by this is that HBase
> has always been about giving clients a 'consistent' view -- at least when
> the query is to the source cluster.  Introducing talk and APIs that speak of
> 'eventual consistency' muddies our story.

As we have discussed in the jira, there are use cases. And it's
optional - all the APIs provide 'consistency' by default (status quo).

>> These are some of the remaining things that we are currently working on:
>>  - RPC failover support for multi-gets
>>  - RPC failover support for scans
>>  - RPC cancellation
> This all sounds great.  I was sort of hoping we wouldn't have to do stuff
> like cancellation ourselves though.  Was hoping we could take on an already
> done 'rpc' engine that did this kind of stuff for us.
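The failover-plus-cancellation items above amount to a backup-request pattern: send the get to the primary, and if it has not answered within a short window, also send it to a secondary; whichever answers first wins and the outstanding call is cancelled. A hedged sketch using plain java.util.concurrent (not the actual HBase RPC engine, whose internals differ):

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

// Illustrative backup-request pattern; method and parameter names are
// made up for the sketch, not taken from HBase.
public class BackupRequestSketch {
    static String getWithBackup(ExecutorService pool,
                                Supplier<String> primary,
                                Supplier<String> secondary,
                                long primaryWaitMs) throws Exception {
        CompletableFuture<String> p = CompletableFuture.supplyAsync(primary, pool);
        try {
            // Give the primary a head start before spending a second RPC.
            return p.get(primaryWaitMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException slowPrimary) {
            // Primary is slow: fire a backup request to a secondary replica.
            CompletableFuture<String> s = CompletableFuture.supplyAsync(secondary, pool);
            // Take whichever answers first, then cancel the outstanding call.
            String winner = (String) CompletableFuture.anyOf(p, s).get();
            p.cancel(true);
            s.cancel(true);
            return winner;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        Supplier<String> slowPrimary = () -> {
            try { Thread.sleep(200); } catch (InterruptedException e) { }
            return "primary";
        };
        Supplier<String> fastSecondary = () -> "secondary";
        System.out.println(getWithBackup(pool, slowPrimary, fastSecondary, 50));
        pool.shutdown();
    }
}
```

This is the kind of machinery an off-the-shelf RPC engine could provide; rolling it by hand is exactly the cancellation work discussed above.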
> ...
>> Development side:
>> As discussed in the issue design doc
>> https://issues.apache.org/jira/secure/attachment/12616659/HighAvailabilityDesignforreadsApachedoc.pdf
>> "Apache
>> code development process" section, at this time we would like to
>> propose:
>>  (1) Creation of HBASE-10070 branch in svn which will be a fork of trunk as
>> of the date branch is created. All of the target authors (me, Devaraj,
>> Nicolas, Sergey) are already committers. I do not remember whether our
>> bylaws require votes on creating branches.
> We don't have bylaws.  It is my understanding that any committer can freely
> make branches and I see nothing wrong w/ this.


>>  (2) The branch will only contain commits that have been reviewed and +1'ed
>> by 2 committers other than the patch author. Every commit in this
>> branch will have a single patch (maybe with unforeseen addendums) and an
>> associated jira which is a subtask of HBASE-10070.
> OK.
>>  (3) We will use the branch HBASE-10070 hosted at my github repo
>> https://github.com/enis/hbase/tree/hbase-10070 as a working branch with
>> semi-dirty history and "this branch might eat your hard drive" guarantees.
>>  (4) All code contributions / review will be welcome as always. I can give
>> you push perms to the github branch if you are interested in contributing.
>>  (5) Once we have HBASE-10070 Phase 1 tasks done (as described in the doc),
>> we will put up a VOTE to merge the branch in. We will require 3 +1's for
>> the merge in. If we can get early reviews, the merge vote will be much less
>> painful, since the branch will be in a clean state and there will have been
>> reviews per patch. We might need a final rebase, but that should not cause major
>> work I imagine.
>> We are hoping this will be a nice way to develop and deliver the feature to
>> the trunk, but as always all suggestions, comments welcome.
> All above sounds good.  Let me go look at what is there in HBASE-10070.

Thanks, please have a look at the code. It's not scary at all :-)

> St.Ack

