hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Region Splitting for moderate amount of daily data - Improve MapReduce Performance
Date Sun, 17 Apr 2011 19:46:52 GMT

> Andrew, when you say this:
> > Because HBase is a DOT it can provide strongly consistent
> > and atomic operations on rows, because rows exist in only
> > one place at a time.
> This excludes the use of HBase replication?


With the new replication feature of 0.92, edits are streamed from one cluster to another. Row
mutations will be consistent/atomic as they are applied at the target, but of course the replication
stream may lag for a number of reasons. Therefore each cluster's view of the row data may differ
at any given moment.

> I'm curious as to where HBase replication places the duplicate(?)
> region blocks in HDFS?

The edits are streamed from the WAL, so no duplicate region blocks are placed in HDFS. WALs are
rolled as usual but may be kept for a longer period of time: until all of their replication-scoped
edits have been streamed to the target cluster.
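
As a rough illustration of the retention rule described above, here is a toy Python model. It is not HBase code: the names SourceCluster, Wal, Edit, ship, and clean are hypothetical, the target cluster is reduced to a plain dict, and a per-call edit budget stands in for replication lag. It only sketches the idea that a rolled WAL is kept until every one of its edits has been streamed, and that the target may trail the source in the meantime.

```python
from dataclasses import dataclass, field


@dataclass
class Edit:
    row: str
    value: str
    replicated: bool  # hypothetical flag: True if the edit is replication-scoped


@dataclass
class Wal:
    edits: list = field(default_factory=list)
    shipped: int = 0  # how many edits the shipper has already processed

    def fully_shipped(self) -> bool:
        return self.shipped == len(self.edits)


class SourceCluster:
    """Toy stand-in for a source cluster; not the real HBase design."""

    def __init__(self):
        self.rows = {}
        self.current = Wal()
        self.rolled = []  # rolled WALs that are still retained

    def put(self, row, value, replicated=True):
        # Apply the mutation locally and record it in the current WAL.
        self.rows[row] = value
        self.current.edits.append(Edit(row, value, replicated))

    def roll(self):
        # Close the current WAL and start a new one.
        self.rolled.append(self.current)
        self.current = Wal()

    def ship(self, target, max_edits):
        # Stream at most max_edits WAL entries, oldest WAL first.
        # Only replication-scoped edits are applied at the target.
        for wal in self.rolled + [self.current]:
            while max_edits > 0 and wal.shipped < len(wal.edits):
                edit = wal.edits[wal.shipped]
                if edit.replicated:
                    target[edit.row] = edit.value
                wal.shipped += 1
                max_edits -= 1

    def clean(self):
        # Discard only those rolled WALs whose edits have all been streamed.
        self.rolled = [w for w in self.rolled if not w.fully_shipped()]


source, target = SourceCluster(), {}
source.put("r1", "a")
source.put("r2", "b")
source.roll()

source.ship(target, max_edits=1)  # the stream lags: only one edit gets through
source.clean()
print(target, len(source.rolled))  # target trails the source; WAL is retained

source.ship(target, max_edits=10)  # the stream catches up
source.clean()
print(target, len(source.rolled))  # views converge; WAL can be discarded
```

Between the two ship calls the source and the target disagree about row r2, which is the lag case mentioned earlier in this message.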

> Also currently is there pass the baton failover when a replicated
> region master fails?

Yes. Via mechanisms mediated by ZooKeeper. But J-D could say more here. 

   - Andy
