hbase-issues mailing list archives

From "Ashu Pachauri (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-15001) Thread Safety issues in ReplicationSinkManager and HBaseInterClusterReplicationEndpoint
Date Thu, 17 Dec 2015 21:28:46 GMT
Ashu Pachauri created HBASE-15001:

             Summary: Thread Safety issues in ReplicationSinkManager and HBaseInterClusterReplicationEndpoint
                 Key: HBASE-15001
                 URL: https://issues.apache.org/jira/browse/HBASE-15001
             Project: HBase
          Issue Type: Bug
          Components: Replication
    Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.2.1
            Reporter: Ashu Pachauri
            Assignee: Ashu Pachauri
            Priority: Critical

ReplicationSinkManager is not thread-safe. This can cause problems in HBaseInterClusterReplicationEndpoint
when the WAL provider is multiwal.
For example: 
1. When multiple threads report bad sinks, the sink list can be non-empty but report a negative
size because the ArrayList itself is not thread-safe. 

2. HBaseInterClusterReplicationEndpoint depends on the number of sinks to batch edits for
shipping. However, the following code can lead it to assume that there are no batches to
process: the sink size is non-zero when it is checked, but by the time we reach the
"batching" part, the sink size has become zero.
    if (replicationSinkMgr.getSinks().size() == 0) {
        return false;
    }
    int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
This is very dangerous: with no batches to process, we report that replication
succeeded even though nothing was actually replicated.
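
One way to close this window is to read the sink count once and use that same value for both the emptiness check and the batch computation. The following is only a minimal sketch: BatchPlanner and planBatches are hypothetical names, the sinks are modeled as plain strings, and it assumes the truncated expression above is bounded by the sink count, as the description suggests.

```java
import java.util.List;

public class BatchPlanner {
    /**
     * Returns the number of batches to ship, or 0 when no sinks are available.
     * The sink count is read exactly once, so the emptiness check and the
     * batch computation always see the same value.
     */
    public static int planBatches(List<String> sinks, int maxThreads, int entryCount) {
        // Hypothetical stand-in for replicationSinkMgr.getSinks().size().
        int sinkCount = sinks.size();
        if (sinkCount == 0) {
            // The caller should treat 0 as "nothing shipped", not as success.
            return 0;
        }
        // Assumption: the third bound is the sink count.
        return Math.min(Math.min(maxThreads, entryCount / 100 + 1), sinkCount);
    }
}
```

With this shape, a caller that receives 0 knows that nothing was shipped and can retry instead of reporting success.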

The idea is to make all operations in ReplicationSinkManager thread-safe and do a verification
on the size of replicated edits before we report success.
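
As a rough illustration of that idea (not the actual patch), the sketch below guards the sink list with synchronized methods, hands callers a private snapshot, and checks the replicated edit count before success is reported. SafeSinkManager, snapshotSinks and replicationSucceeded are hypothetical names, and the sinks are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

public class SafeSinkManager {
    // All access to this list goes through synchronized methods.
    private final List<String> sinks = new ArrayList<>();

    public SafeSinkManager(List<String> initialSinks) {
        sinks.addAll(initialSinks);
    }

    /** Removes a sink that a shipper thread reported as bad. */
    public synchronized void reportBadSink(String sink) {
        sinks.remove(sink);
    }

    /** Returns a copy, so callers can size and iterate it without races. */
    public synchronized List<String> snapshotSinks() {
        return new ArrayList<>(sinks);
    }

    /** Success is reported only when every expected edit was replicated. */
    public static boolean replicationSucceeded(int expectedEdits, int replicatedEdits) {
        return replicatedEdits == expectedEdits;
    }
}
```

Returning a copy from snapshotSinks trades a small allocation for a list whose size cannot change underneath the caller.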

This message was sent by Atlassian JIRA
