hbase-dev mailing list archives

From "cuijianwei (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently
Date Mon, 24 Feb 2014 10:03:20 GMT
cuijianwei created HBASE-10598:

             Summary: Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently
                 Key: HBASE-10598
                 URL: https://issues.apache.org/jira/browse/HBASE-10598
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.94.16
            Reporter: cuijianwei

In our test environment, we found that written data occasionally cannot be read back. After
debugging, we found that the maximumTimestamp of MemStore#timeRangeTracker might decrease, and
the minimumTimestamp might increase, when MemStore#timeRangeTracker is updated concurrently.
This can cause the MemStore or StoreFile to be filtered out incorrectly when reading data.
Let's look at how concurrent updates of timeRangeTracker#maximumTimestamp cause this problem.
Imagine two threads, T1 and T2, putting two KeyValues, kv1 and kv2. kv1 and kv2 belong
to the same Store (the same region) but have different rowkeys, so they can be updated
concurrently. In the implementation of HRegionServer#multi, kv1 and kv2 are added to the
MemStore by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then MemStore#internalAdd
is invoked and MemStore#timeRangeTracker is updated by TimeRangeTracker#includeTimestamp
as follows:
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    ...
  }
Suppose the current maximumTimestamp is t0 before includeTimestamp is invoked, kv1.timestamp=t1,
kv2.timestamp=t2, t1 and t2 are both set by the user (so the user knows the timestamps of kv1
and kv2), and t1 > t2. T1 and T2 execute concurrently, so both threads might find that the
current maximumTimestamp is less than the timestamp of their own kv. After that, T1 and T2
will each set maximumTimestamp to the timestamp of their own kv. If T1 sets maximumTimestamp
before T2 does, maximumTimestamp ends up as t2. Then, before any new update with a bigger
timestamp has been applied to the MemStore, if we try to read kv1 with HTable#get and set the
timestamp of the 'Get' to t1, the StoreScanner decides whether the MemStoreScanner (assuming
kv1 has not been flushed) should be selected as a candidate scanner by calling
MemStoreScanner#shouldUseScanner. The MemStore won't be selected because the maximumTimestamp
of the MemStore has been set to t2 (t2 < t1). Consequently, the written kv1 can't be read
and kv1 is lost from the user's perspective.
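
The interleaving described above can be replayed deterministically in a single thread. The following is a minimal sketch, not HBase code: the class StaleMaxReplay and its methods are hypothetical names, and memStoreMayContain models only the upper bound of the time-range check that a scanner filter would apply.

```java
class StaleMaxReplay {
    // Hypothetical stand-in for the unsynchronized maximumTimestamp field.
    static long maximumTimestamp;

    // The "check" half of the check-then-act in includeTimestamp.
    static boolean shouldRaiseMax(long timestamp) {
        return maximumTimestamp < timestamp;
    }

    // Simplified filter: only the upper bound of the time range is modeled.
    static boolean memStoreMayContain(long getTimestamp) {
        return getTimestamp <= maximumTimestamp;
    }

    // Replays the bad schedule: both threads check first, then T1 writes, then T2 writes.
    static long replay(long t0, long t1, long t2) {
        maximumTimestamp = t0;
        boolean t1Passes = shouldRaiseMax(t1); // T1 sees t0 < t1
        boolean t2Passes = shouldRaiseMax(t2); // T2 sees t0 < t2
        if (t1Passes) {
            maximumTimestamp = t1;             // T1 writes first
        }
        if (t2Passes) {
            maximumTimestamp = t2;             // T2 writes last and overwrites t1
        }
        return maximumTimestamp;
    }

    public static void main(String[] args) {
        long max = replay(0L, 20L, 10L);       // t0=0, t1=20, t2=10
        System.out.println("maximumTimestamp = " + max);
        System.out.println("Get(ts=20) uses MemStore: " + memStoreMayContain(20L));
    }
}
```

With t0=0, t1=20 and t2=10, the tracker ends at 10, so a read at timestamp 20 is filtered out even though kv1 is present.
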
If the above analysis is right, then after maximumTimestamp of MemStore#timeRangeTracker has
been set to t2, the user will experience data loss in the following situations:
1. Before any new write with kv.timestamp > t1 has been added to the MemStore, a read request
for kv1 with timestamp=t1 cannot read kv1.
2. Before any new put with kv.timestamp > t1 has been added to the MemStore, if a flush happens,
the data of the MemStore will be flushed to a StoreFile with StoreFile#maximumTimestamp set to t2.
After that, any read request with timestamp=t1 cannot read kv1 before the next compaction (the
content of the StoreFile won't change, and kv1.timestamp might still not be included even after
compaction).
The second situation is much more serious because the incorrect time range of the MemStore has
been persisted to the file. Similarly, concurrent updates of TimeRangeTracker#minimumTimestamp
may cause the same problem.
As a simple way to fix the problem, we could add synchronized to TimeRangeTracker#includeTimestamp
so that this method won't be invoked concurrently.
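
A minimal sketch of the proposed fix follows, under stated assumptions: SynchronizedTracker is a hypothetical stand-in rather than the actual TimeRangeTracker, its initial values (Long.MAX_VALUE and Long.MIN_VALUE) are chosen for the sketch, and hammer is an illustrative stress helper. Marking the method synchronized makes each comparison and its assignment one atomic step, so a smaller timestamp can no longer overwrite a larger maximum.

```java
import java.util.concurrent.CountDownLatch;

class SynchronizedTracker {
    // Sketch-only initial values; the real class may initialize differently.
    private long minimumTimestamp = Long.MAX_VALUE;
    private long maximumTimestamp = Long.MIN_VALUE;

    // synchronized: compare and assign happen under one lock.
    public synchronized void includeTimestamp(long timestamp) {
        if (timestamp < minimumTimestamp) {
            minimumTimestamp = timestamp;
        }
        if (timestamp > maximumTimestamp) {
            maximumTimestamp = timestamp;
        }
    }

    // Getters take the same lock so readers see the latest values.
    public synchronized long getMinimumTimestamp() { return minimumTimestamp; }
    public synchronized long getMaximumTimestamp() { return maximumTimestamp; }

    // Hypothetical stress helper: threads insert interleaved timestamps 1..threads*perThread.
    static SynchronizedTracker hammer(int threads, int perThread) {
        SynchronizedTracker tracker = new SynchronizedTracker();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers at once to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < perThread; j++) {
                    tracker.includeTimestamp(1L + id + (long) j * threads);
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return tracker;
    }

    public static void main(String[] args) {
        SynchronizedTracker t = hammer(8, 10000);
        System.out.println(t.getMinimumTimestamp() + " .. " + t.getMaximumTimestamp());
    }
}
```

The trade-off is that every write to the Store now contends on one lock; whether that cost is acceptable on the write path would need to be measured.
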

