hbase-dev mailing list archives

From Andrew Purtell <apurt...@yahoo.com>
Subject Re: HBase Replication use cases
Date Fri, 13 Apr 2012 00:50:55 GMT
I have a new use case that will involve replication intercontinentally, between two EC2 regions.
Using 0.94. It will only be a proof of concept but might shake out something. I will also
have an economic incentive to minimize transfer. 

Best regards,

    - Andy

On Apr 13, 2012, at 5:50 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Thanks Himanshu,
> We're planning to use Replication for cross DC replication for DR (and we added a bunch
> of stuff and fixed bugs in replication).
> We'll have it always on (and only use stop/start_peer, which is new in 0.94+, to temporarily
> stop replication, rather than stop/start_replication).
> HBASE-2611 is a problem. We did not have time recently to work on this.
> i) and ii) can be worked around by forcing a log roll on all region servers after replication
> was enabled. Replication would be considered started after the logs were
> rolled... But that is quite annoying.
> Is iii) still a problem in 0.92+? I thought we fixed that together with a).
> -- Lars
> ________________________________
> From: Himanshu Vashishtha <hvashish@cs.ualberta.ca>
> To: dev@hbase.apache.org 
> Sent: Thursday, April 12, 2012 12:11 PM
> Subject: HBase Replication use cases
> Hello All,
> I have been doing testing on the HBase replication (0.90.4, and 0.92 variants).
> Here are some of the findings:
> a) 0.90+ is not that great at handling znode changes; in an
> ongoing replication, if I delete a peer and a region server goes to
> the znode to update the log status, the region server aborts itself
> when it sees a missing znode.
> Recoverable ZooKeeper seems to have fixed this in 0.92+?
> 0.92 has a lot of new features (start/stop handle, master-master, cyclic).
> But there are corner cases with the start/stop switches.
> i) A log is enqueued when the replication state is set to true. When we
> start the cluster, it is true and the starting region server takes the
> new log into the queue. If I do a stop_replication, and there is a log
> roll, and then I do a start_replication, the current log will not be
> replicated, as it has missed the opportunity of being added to the queue.
> ii) If I _start_ a region server when the replication state is set to
> false, its log will not be added to the queue. Now, if I do a
> start_replication, its log will not be replicated.
> iii) Removing a peer doesn't result in master region server abort, but
> if zk is down and there is a log roll, it will abort. Not a
> serious one, as zk is down so the cluster is not healthy anyway.
> I was looking for jiras (including 2611), and stumbled upon 2223. I
> don't think there is anything like time based partition behavior (as
> mentioned in the jira description). Though the patch has a lot of other
> nice things which indeed are in existing code. Please correct me if I
> missed anything.
> Having said that, I wonder how other folks out there use it:
> their experience, and the common issues (minor + major) they come across.
> I did find a ppt by Jean Daniel at oscon mentioning its use in
> SU production.
> I plan to file jiras for the above ones and will start digging in.
> Looking forward to your responses.
> Thanks,
> Himanshu
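
The start/stop corner cases i) and ii) in the original message, and the log-roll workaround suggested in the reply, can be illustrated with a toy model. This is only a sketch under the assumption stated in i): a log is enqueued for replication only if the replication state is true at the moment the log is created. The names Cluster, RegionServer, roll_log and force_roll are hypothetical and are not HBase classes or shell commands.

```python
# Toy model only: none of these names are HBase APIs.
# Assumption (from case i): a log is enqueued only if replication
# is enabled at the moment the log is created.

class Cluster:
    def __init__(self):
        self.replicating = True   # state is true when the cluster starts
        self.servers = []

    def add_server(self, name):
        rs = RegionServer(name, self)
        self.servers.append(rs)
        return rs

    def stop_replication(self):
        self.replicating = False

    def start_replication(self, force_roll=False):
        self.replicating = True
        if force_roll:
            # Workaround from the reply: roll every log after enabling,
            # so each server gets a fresh log that is enqueued.
            for rs in self.servers:
                rs.roll_log()


class RegionServer:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.seq = 0
        self.queue = []           # logs waiting to be replicated
        self.current_log = None
        self.roll_log()           # a starting server opens its first log

    def roll_log(self):
        self.seq += 1
        self.current_log = f"{self.name}-log-{self.seq}"
        if self.cluster.replicating:
            self.queue.append(self.current_log)

    def current_log_queued(self):
        return self.current_log in self.queue


cluster = Cluster()
rs1 = cluster.add_server("rs1")   # replication on: rs1-log-1 is queued
cluster.stop_replication()
rs1.roll_log()                    # case i): rs1-log-2 misses the queue
rs2 = cluster.add_server("rs2")   # case ii): rs2-log-1 misses the queue
cluster.start_replication()
print(rs1.current_log_queued(), rs2.current_log_queued())  # False False

cluster.start_replication(force_roll=True)
print(rs1.current_log_queued(), rs2.current_log_queued())  # True True
```

In this model the edits written to rs1-log-2 and rs2-log-1 are still never queued, which matches the reply's point that replication would only be considered started after the logs were rolled.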
