hbase-issues mailing list archives

From "cuijianwei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently
Date Tue, 25 Feb 2014 02:46:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911161#comment-13911161
] 

cuijianwei commented on HBASE-10598:
------------------------------------

Thanks for your comments [~enis], [~stack] and [~xieliang007]. Because AtomicLong#compareAndSet
only updates the value when it equals the expected value, we might need a retry loop like the
following to update maximumTimestamp in TimeRangeTracker#includeTimestamp if we define it as an AtomicLong.
{code}
AtomicLong maximumTimestamp = new AtomicLong(-1);
...
  private void includeTimestamp(final long timestamp) {
    ....
    long lastTimestamp = maximumTimestamp.get();
    // Retry only while the new timestamp is still larger and the CAS lost a race.
    while (lastTimestamp < timestamp
        && !maximumTimestamp.compareAndSet(lastTimestamp, timestamp)) {
      lastTimestamp = maximumTimestamp.get();
    }
  }
{code}
Is there a simpler way to use AtomicLong for the logic in this method? On the other
hand, if we add synchronized to TimeRangeTracker#includeTimestamp, we also need to make the
data members volatile, as [~stack] suggested.
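
A minimal sketch of what the AtomicLong variant could look like, assuming a Java 8 or later runtime (AtomicLong#accumulateAndGet does not exist in earlier versions, so this may not be usable on the 0.94 branch). The class name AtomicTimeRangeSketch, the Long.MAX_VALUE initial minimum and the getters are hypothetical and not taken from the HBase code. includeTimestampCas shows the same logic with explicit compareAndSet retry loops for older runtimes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for TimeRangeTracker; names and initial values are assumptions.
class AtomicTimeRangeSketch {
  private final AtomicLong minimumTimestamp = new AtomicLong(Long.MAX_VALUE);
  private final AtomicLong maximumTimestamp = new AtomicLong(-1);

  // Java 8+: accumulateAndGet retries the CAS internally until the update is applied.
  void includeTimestamp(final long timestamp) {
    minimumTimestamp.accumulateAndGet(timestamp, Math::min);
    maximumTimestamp.accumulateAndGet(timestamp, Math::max);
  }

  // Explicit retry loops for runtimes without accumulateAndGet.
  void includeTimestampCas(final long timestamp) {
    long lastMax = maximumTimestamp.get();
    while (lastMax < timestamp && !maximumTimestamp.compareAndSet(lastMax, timestamp)) {
      lastMax = maximumTimestamp.get();
    }
    long lastMin = minimumTimestamp.get();
    while (lastMin > timestamp && !minimumTimestamp.compareAndSet(lastMin, timestamp)) {
      lastMin = minimumTimestamp.get();
    }
  }

  long getMinimumTimestamp() { return minimumTimestamp.get(); }
  long getMaximumTimestamp() { return maximumTimestamp.get(); }

  public static void main(String[] args) throws InterruptedException {
    final AtomicTimeRangeSketch tracker = new AtomicTimeRangeSketch();
    Thread t1 = new Thread(() -> tracker.includeTimestamp(200L));
    Thread t2 = new Thread(() -> tracker.includeTimestamp(100L));
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // Prints 100..200 regardless of how the two threads interleave.
    System.out.println(tracker.getMinimumTimestamp() + ".." + tracker.getMaximumTimestamp());
  }
}
```

Note that the two bounds are updated independently, so a reader may briefly see a new minimum with an old maximum (or the reverse). Each bound only ever moves outward, which is the property the scanner filter appears to rely on.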

> Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10598
>                 URL: https://issues.apache.org/jira/browse/HBASE-10598
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.16
>            Reporter: cuijianwei
>            Assignee: cuijianwei
>            Priority: Critical
>         Attachments: HBASE-10598-0.94.v1.patch
>
>
> In our test environment, we find that written data occasionally can't be read out. After debugging,
we find that maximumTimestamp/minimumTimestamp of MemStore#timeRangeTracker might decrease/increase
when MemStore#timeRangeTracker is updated concurrently, which might cause the MemStore/StoreFile
to be filtered out incorrectly when reading data. Let's see how the concurrent update of
timeRangeTracker#maximumTimestamp causes this problem.
> Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. kv1 and
kv2 belong to the same Store (and so to the same region), but contain different rowkeys.
Consequently, kv1 and kv2 could be updated concurrently. Looking at the implementation of
HRegionServer#multi, kv1 and kv2 will be added to the MemStore by HRegion#applyFamilyMapToMemstore
in HRegion#doMiniBatchMutation. Then MemStore#internalAdd is invoked and MemStore#timeRangeTracker
is updated by TimeRangeTracker#includeTimestamp as follows:
> {code}
>   private void includeTimestamp(final long timestamp) {
>      ...
>     else if (maximumTimestamp < timestamp) {
>       maximumTimestamp = timestamp;
>     }
>     return;
>   }
> {code}
> Imagine the current maximumTimestamp of the TimeRangeTracker is t0 before includeTimestamp(...)
is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are both set by the user (so the user knows
the timestamps of kv1 and kv2), and t1 > t2 > t0. T1 and T2 execute concurrently,
so both threads might find that the current maximumTimestamp is less than the timestamp
of their kv. After that, T1 and T2 will both set maximumTimestamp to the timestamp of their kv. If
T1 sets maximumTimestamp before T2 does, maximumTimestamp ends up as t2. Then,
before any new update with a bigger timestamp has been applied to the MemStore, if we try to
read kv1 by HTable#get with the timestamp of the 'Get' set to t1, the StoreScanner will decide
whether the MemStoreScanner (imagining kv1 has not been flushed) should be selected as a candidate
scanner by MemStoreScanner#shouldUseScanner. The MemStore won't be selected in MemStoreScanner#shouldUseScanner
because maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, the
written kv1 can't be read out and kv1 is lost from the user's perspective.
> If the above analysis is right, after maximumTimestamp of MemStore#timeRangeTracker has
been set to t2, the user will experience data loss in the following situations:
> 1. Before any new write with kv.timestamp > t1 has been added to the MemStore, a read
request for kv1 with timestamp=t1 cannot read kv1 out.
> 2. Before any new write with kv.timestamp > t1 has been added to the MemStore, if a
flush happens, the data of the MemStore will be flushed to a StoreFile with StoreFile#maximumTimestamp
set to t2. After that, any read request with timestamp=t1 cannot read kv1 before the next compaction (actually,
kv1.timestamp might not be included in the timeRange of the StoreFile even after compaction).
> The second situation is much more serious because the incorrect timeRange of MemStore
has been persisted to the file. 
> Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also cause
this problem.
> As a simple way to fix the problem, we could add synchronized to TimeRangeTracker#includeTimestamp
so that this method won't be invoked concurrently.
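
The synchronized fix described in the issue could look roughly like the following. This is a sketch, not the attached patch: the class name, the -1 "unset" sentinel, the branches standing in for the "..." in the quoted method, and the getters are all assumptions. The fields are declared volatile on the assumption that readers access them without taking the lock.

```java
// Hypothetical stand-in for TimeRangeTracker; not the HBase class or the attached patch.
class SynchronizedTimeRangeSketch {
  // volatile so that readers that do not hold the lock still see the latest values.
  private volatile long minimumTimestamp = -1;
  private volatile long maximumTimestamp = -1;

  // Only one thread at a time can run the check-then-set sequence,
  // so a smaller timestamp can no longer overwrite a larger maximum.
  synchronized void includeTimestamp(final long timestamp) {
    if (maximumTimestamp == -1) {
      // Assumed first-update branch: both bounds start at the first timestamp.
      minimumTimestamp = timestamp;
      maximumTimestamp = timestamp;
    } else if (minimumTimestamp > timestamp) {
      minimumTimestamp = timestamp;
    } else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
  }

  long getMinimumTimestamp() { return minimumTimestamp; }
  long getMaximumTimestamp() { return maximumTimestamp; }
}
```

With this shape, a call with t1 followed by a call with t2 (t2 < t1) leaves the maximum at t1, because the second call re-checks the maximum while holding the lock.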



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
