hbase-dev mailing list archives

From Tianying Chang <tych...@gmail.com>
Subject Hbase replication perf issue
Date Wed, 06 Aug 2014 18:33:19 GMT

We are seeing a performance issue on one of our write-heavy clusters and are
trying to find the root cause. One point of confusion during the investigation:
the class comment in ReplicationSink.java says the following, which seems wrong:

 * This class is responsible for replicating the edits coming
 * from another cluster.
 * <p/>
 * This replication process is currently waiting for the edits to be applied
 * before the method can return. This means that the replication of edits
 * is synchronized (after reading from HLogs in ReplicationSource) and that
 * single region server cannot receive edits from two sources at the same time
 * <p/>
 * This class uses the native HBase client in order to replicate entries.
 * <p/>

I think replicateLogEntries() is a public API provided by HRegionServer. If
two sources pick the same sink and send their requests at the same time,
each request should be dequeued by a free handler thread on that
RegionServer, and the two should be processed in parallel. How can the class
achieve what the comment states?
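
To make the concern concrete, here is a minimal sketch in plain Java, not
HBase code. FakeSink, HandlerPoolSketch and observedConcurrency are
hypothetical names, and the fixed thread pool only stands in for the
RegionServer handler threads. It shows that a method with no lock of its own
can be entered by two callers at once when each call runs on a separate
handler thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HandlerPoolSketch {

    // Hypothetical stand-in for a sink; this is NOT the real ReplicationSink.
    static class FakeSink {
        private final AtomicInteger inFlight = new AtomicInteger();
        private final AtomicInteger maxInFlight = new AtomicInteger();
        private final CyclicBarrier allInside;

        FakeSink(int parties) {
            this.allInside = new CyclicBarrier(parties);
        }

        // Deliberately not synchronized, like a plain public RPC method.
        void replicateLogEntries(String source) throws Exception {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            try {
                // Proceeds only once every caller is inside the method together.
                // If the method were synchronized, this would time out instead.
                allInside.await(5, TimeUnit.SECONDS);
            } finally {
                inFlight.decrementAndGet();
            }
        }

        int maxInFlight() {
            return maxInFlight.get();
        }
    }

    // Sends one call per simulated source through a pool of handler threads
    // and reports how many calls were inside the method at the same moment.
    static int observedConcurrency(int sources) throws Exception {
        FakeSink sink = new FakeSink(sources);
        ExecutorService handlers = Executors.newFixedThreadPool(sources);
        try {
            List<Future<?>> calls = new ArrayList<>();
            for (int i = 0; i < sources; i++) {
                final String name = "source-" + i;
                calls.add(handlers.submit(() -> {
                    sink.replicateLogEntries(name);
                    return null;
                }));
            }
            for (Future<?> call : calls) {
                call.get(); // rethrows if a call failed or timed out
            }
        } finally {
            handlers.shutdown();
        }
        return sink.maxInFlight();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("max concurrent calls: " + observedConcurrency(2));
    }
}
```

With two simulated sources this prints "max concurrent calls: 2". Whether the
real sink behaves this way depends on locking inside HBase that this sketch
does not model.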

Am I missing something here? In summary, two questions:

1. How can it prevent two sources from invoking replicateLogEntries() at
the same time?
2. If the author really intended to prevent two sources from invoking
replicateLogEntries() at the same time, what is the concern? I think that,
with the timestamp in each put, the order should not matter.
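
The reasoning behind question 2 can be sketched in plain Java, again without
any HBase classes. TimestampOrderSketch, Put and latestAfterApplying are
hypothetical names, and the model is a simplification: one cell whose
versions are keyed by timestamp, with no deletes and no two puts sharing a
timestamp. Under those assumptions the newest visible value is the same
regardless of the order in which the puts arrive:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

public class TimestampOrderSketch {

    // Hypothetical, simplified put: one value with an explicit timestamp.
    static final class Put {
        final long timestamp;
        final String value;

        Put(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Applies the puts in the given arrival order to a single cell and
    // returns the value with the highest timestamp.
    static String latestAfterApplying(List<Put> puts) {
        TreeMap<Long, String> versions = new TreeMap<>();
        for (Put p : puts) {
            versions.put(p.timestamp, p.value);
        }
        return versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        List<Put> inOrder = new ArrayList<>();
        inOrder.add(new Put(100L, "v1"));
        inOrder.add(new Put(200L, "v2"));
        inOrder.add(new Put(300L, "v3"));

        List<Put> reversed = new ArrayList<>(inOrder);
        Collections.reverse(reversed);

        // Both arrival orders end with the same newest version.
        System.out.println(latestAfterApplying(inOrder));
        System.out.println(latestAfterApplying(reversed));
    }
}
```

Both lines print "v3". Deletes, or puts that share a timestamp, are outside
this model and may be where ordering starts to matter.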

