Date: Tue, 4 Mar 2014 20:47:46 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

    [ https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919988#comment-13919988 ]

Hudson commented on HBASE-10598:
--------------------------------

SUCCESS: Integrated in HBase-TRUNK #4978 (See [https://builds.apache.org/job/HBase-TRUNK/4978/])
HBASE-10624 Fix 2 new findbugs warnings introduced by HBASE-10598 (tedyu: rev 1574149)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java

> Written data can not be read out because
> MemStore#timeRangeTracker might be updated concurrently
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10598
>                 URL: https://issues.apache.org/jira/browse/HBASE-10598
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.16
>            Reporter: cuijianwei
>            Assignee: cuijianwei
>            Priority: Critical
>             Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18
>
>         Attachments: HBASE-10598-0.94-v2.patch, HBASE-10598-0.94.v1.patch, HBASE-10598-trunk-v1.patch
>
>
> In our test environment, we find that written data occasionally cannot be read out. After debugging, we found that the maximumTimestamp/minimumTimestamp of MemStore#timeRangeTracker might decrease/increase when MemStore#timeRangeTracker is updated concurrently, which can cause the MemStore/StoreFile to be filtered out incorrectly when reading data. Let's see how concurrent updates of timeRangeTracker#maximumTimestamp cause this problem.
> Imagine there are two threads, T1 and T2, putting two KeyValues, kv1 and kv2. kv1 and kv2 belong to the same Store (and so to the same region) but contain different rowkeys. Consequently, kv1 and kv2 can be updated concurrently. Following the implementation of HRegionServer#multi, kv1 and kv2 will be added to the MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. Then MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be updated by TimeRangeTracker#includeTimestamp as follows:
> {code}
> private void includeTimestamp(final long timestamp) {
>   ...
>   else if (maximumTimestamp < timestamp) {
>     maximumTimestamp = timestamp;
>   }
>   return;
> }
> {code}
> Imagine that the current maximumTimestamp of the TimeRangeTracker is t0 before includeTimestamp(...) is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > t0.
> T1 and T2 will be executed concurrently; therefore, the two threads might both find that the current maximumTimestamp is less than the timestamp of their own kv. After that, T1 and T2 will each set maximumTimestamp to the timestamp of its kv. If T1 sets maximumTimestamp before T2 does, maximumTimestamp will end up set to t2. Then, before any new update with a bigger timestamp has been applied to the MemStore, if we try to read out kv1 by HTable#get and set the timestamp of the 'Get' to t1, the StoreScanner will decide whether the MemStoreScanner (imagining kv1 has not been flushed) should be selected as a candidate scanner by calling MemStoreScanner#shouldUseScanner. The MemStore won't be selected in MemStoreScanner#shouldUseScanner because the maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, the written kv1 can't be read out, and kv1 is lost from the user's perspective.
> If the above analysis is right, after the maximumTimestamp of MemStore#timeRangeTracker has been set to t2, the user will experience data loss in the following situations:
> 1. Before any new write with kv.timestamp > t1 has been added to the MemStore, a read request for kv1 with timestamp=t1 cannot read kv1 out.
> 2. Before any new write with kv.timestamp > t1 has been added to the MemStore, if a flush happens, the data of the MemStore will be flushed to a StoreFile with StoreFile#maximumTimestamp set to t2. After that, any read request with timestamp=t1 cannot read kv1 before the next compaction (actually, kv1.timestamp might not be included in the timeRange of the StoreFile even after compaction).
> The second situation is much more serious because the incorrect timeRange of the MemStore has been persisted to the file.
> Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also cause this problem.
> As a simple way to fix the problem, we could add synchronized to TimeRangeTracker#includeTimestamp so that this method won't be invoked concurrently.
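
The fix proposed above can be sketched in Java as follows. This is only a minimal illustration, not the attached patch and not HBase's real TimeRangeTracker: the class names SketchTimeRangeTracker and TimeRangeTrackerDemo, the methods mayContain and runConcurrentWrites, and the use of Long.MAX_VALUE/Long.MIN_VALUE as "empty" markers are all assumptions made for the example. The point it shows is that once includeTimestamp is synchronized, the check and the write happen as one step, so a smaller timestamp can no longer overwrite a larger maximum.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified stand-in for a time range tracker (not the HBase class). */
class SketchTimeRangeTracker {
    // Assumption for this sketch: an empty tracker has min > max.
    private long minimumTimestamp = Long.MAX_VALUE;
    private long maximumTimestamp = Long.MIN_VALUE;

    /** synchronized makes the compare-then-assign sequence atomic. */
    synchronized void includeTimestamp(final long timestamp) {
        if (timestamp < minimumTimestamp) {
            minimumTimestamp = timestamp;
        }
        if (timestamp > maximumTimestamp) {
            maximumTimestamp = timestamp;
        }
    }

    /** Rough analogue of the filter a scanner applies before reading. */
    synchronized boolean mayContain(final long timestamp) {
        return minimumTimestamp <= timestamp && timestamp <= maximumTimestamp;
    }

    synchronized long getMinimumTimestamp() {
        return minimumTimestamp;
    }

    synchronized long getMaximumTimestamp() {
        return maximumTimestamp;
    }
}

class TimeRangeTrackerDemo {
    /** Starts `writers` threads that record timestamps 1..writers at the same time. */
    static SketchTimeRangeTracker runConcurrentWrites(final int writers) {
        final SketchTimeRangeTracker tracker = new SketchTimeRangeTracker();
        final ExecutorService pool = Executors.newFixedThreadPool(writers);
        final CountDownLatch start = new CountDownLatch(1);
        for (int i = 1; i <= writers; i++) {
            final long timestamp = i;
            pool.execute(() -> {
                try {
                    start.await(); // release all writers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                tracker.includeTimestamp(timestamp);
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return tracker;
    }

    public static void main(String[] args) {
        SketchTimeRangeTracker tracker = runConcurrentWrites(8);
        System.out.println("min=" + tracker.getMinimumTimestamp()
            + " max=" + tracker.getMaximumTimestamp());
    }
}
```

Without the synchronized keyword on includeTimestamp, two writers could both pass the comparison and the later assignment could carry the smaller value, which is the t2-over-t1 outcome described in the report. The getters are synchronized as well so that a reader always sees the values written under the same lock.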
-- This message was sent by Atlassian JIRA (v6.2#6252)