hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Questions about HBase replication
Date Tue, 16 Dec 2014 16:33:34 GMT
To add to what Jieshan said:

On Fri, Nov 21, 2014 at 8:32 PM, Cliff <cliffcheng411@gmail.com> wrote:

> 1.
> Why does "HBase replication" need replicationSink?
> I think replicationSource can do replicationSink's work as well.
> And if we don't use replicationSink, we just need one time I/O.

If you were to use HTable from the source:

- All your meta lookups would be a lot slower than if you were local. We
rely on this to be extremely fast.

- You would be sending at least as many RPCs, but probably more since
you'll be sending them directly to each region server on the slave side,
chunked up by table. More, tinier RPCs probably aren't what you want over
a WAN.

- BTW, sending one big batch can also make RPC compression more efficient.

- Retries would be done over the WAN. For example, you're regularly sending
2MB batches to a region and then it moves. The first batch that gets sent
after the move will go to where you think the region is, only to get an
NSRE (NotServingRegionException). You'll then do a meta lookup to find the
new location, again over the WAN, and send those 2MB again to the new
location. It's a lot of back and forth you'd rather do in a LAN.
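
To make the trade-offs above a bit more concrete, here is a rough toy model in Python. It is not HBase code: the edit tuples, the function names, and the round-trip counts are all simplifying assumptions made up for illustration. It only mirrors the three points above: direct writes fan out into one RPC per (region server, table) group, one big batch tends to compress better than many small ones, and a region move costs extra WAN round trips when the source talks to the region servers itself.

```python
import zlib
from collections import defaultdict

# Hypothetical edits: (table, slave_region_server, payload_bytes).
SAMPLE_EDITS = [
    ("t1", "rs1", b"row-0001:cf:q=value;"),
    ("t1", "rs2", b"row-0002:cf:q=value;"),
    ("t2", "rs1", b"row-0003:cf:q=value;"),
    ("t1", "rs1", b"row-0004:cf:q=value;"),
]


def direct_rpc_count(edits):
    """Source writes straight to the slave's region servers:
    assume one WAN RPC per (region server, table) group."""
    groups = defaultdict(list)
    for table, server, payload in edits:
        groups[(server, table)].append(payload)
    return len(groups)


def sink_rpc_count(edits):
    """Source ships the whole batch to a single sink: one WAN RPC."""
    return 1 if edits else 0


def wan_round_trips_after_region_move(via_sink):
    """Simplified count of WAN round trips needed to deliver one batch
    right after its target region has moved."""
    if via_sink:
        # The batch crosses the WAN once; the failed write, the meta
        # lookup and the resend all happen inside the slave's LAN.
        return 1
    # Direct: failed send (NSRE) + meta lookup + resend, all over the WAN.
    return 3


def compressed_size(chunks, together):
    """Total compressed bytes, either as one batch or chunk by chunk."""
    if together:
        return len(zlib.compress(b"".join(chunks)))
    return sum(len(zlib.compress(chunk)) for chunk in chunks)


# 200 small, similar-looking payloads standing in for many tiny RPCs.
SAMPLE_CHUNKS = [b"row-%04d:cf:qualifier=value;" % i for i in range(200)]
```

With the sample data, `direct_rpc_count(SAMPLE_EDITS)` counts three separate groups where `sink_rpc_count(SAMPLE_EDITS)` counts a single RPC, and `compressed_size(SAMPLE_CHUNKS, together=True)` comes out smaller than compressing each chunk on its own. Real numbers depend on your region layout, batch sizes and codec.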

Hope this helps,

