hbase-dev mailing list archives

From "Yunfan Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10464) Race condition during RS shutdown that could cause data loss
Date Tue, 04 Feb 2014 21:46:10 GMT
Yunfan Zhong created HBASE-10464:
------------------------------------

             Summary: Race condition during RS shutdown that could cause data loss
                 Key: HBASE-10464
                 URL: https://issues.apache.org/jira/browse/HBASE-10464
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.89-fb
            Reporter: Yunfan Zhong
            Priority: Critical
             Fix For: 0.89-fb


Bug scenario (T* are timestamps, with T1 < T2 < ... < Tn):
1. Master assigns a region to the RS at T1.
2. The RS works on opening the region from T1 to T3.
3. While the region is still opening, the RS starts to shut down at T2; its DFS client is
closed at T5.
4. Regions owned by the RS are closed as a step of RS shutdown, but the newly opened region
is missed: it stays online from T3 to T5 and holds mutations in memory that arrived after
its last flush, if any, at T4.
5. Because the master believes the RS shut down cleanly, no log splitting is done. The HLog
is moved to the old logs directory as usual.
6. Mutations held in memory from T4 to T5 (or from T3 to T5 if no flush happened at T4) are
never flushed. They exist only in the WAL, and only if the WAL is turned on.
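
The timeline above can be replayed in a small single-threaded model. This is only an
illustrative sketch, not HBase code: the class name, the region and mutation names, and
the assumption that shutdown works from a list of regions captured at T2 are all
hypothetical. It shows which mutations are still in memory when the DFS client closes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-threaded replay of the timeline; not HBase code. */
class ShutdownRaceReplay {
    /** Returns the mutations still in memory when the DFS client closes at T5. */
    static List<String> replay(boolean flushAtT4) {
        List<String> online = new ArrayList<>();   // regions the RS has online
        List<String> memstore = new ArrayList<>(); // unflushed mutations of the new region

        online.add("existing-region");

        // T2: shutdown starts. Assumption: it captures the regions to close now.
        List<String> toClose = new ArrayList<>(online);

        // T3: the in-flight open finishes after that list was captured.
        online.add("new-region");
        memstore.add("put-1");

        // T4: optional last flush of the new region.
        if (flushAtT4) {
            memstore.clear();
        }
        memstore.add("put-2");

        // Shutdown closes only the captured regions; new-region is missed.
        online.removeAll(toClose);

        // T5: the DFS client closes; anything left here is never flushed.
        return memstore;
    }
}
```

With no flush at T4 the model leaves both mutations unflushed; with a flush at T4 only
the later one remains.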

Fix is to prevent region opening from succeeding when the RS is shutting down.
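
One way such a guard could look is sketched below. This is not the actual patch and does
not use HBase's real classes; the class and method names are hypothetical. The assumption
is that the final step of an open and the start of shutdown take the same lock, so a
region either appears in the set that shutdown closes or its open fails.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a region server; not HBase's actual classes. */
class ShutdownAwareRegionServer {
    private final Object lock = new Object();
    private final Set<String> onlineRegions = new HashSet<>();
    private boolean stopping = false;

    /**
     * Final step of opening a region. Returns false (the open fails) if
     * shutdown has already begun, so the region never comes online.
     */
    boolean finishOpen(String regionName) {
        synchronized (lock) {
            if (stopping) {
                return false;
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    /**
     * Begins shutdown and returns the regions that must be flushed and
     * closed before the DFS client is closed.
     */
    List<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            List<String> toClose = new ArrayList<>(onlineRegions);
            onlineRegions.clear();
            return toClose;
        }
    }

    /** Reports whether the region is currently online. */
    boolean isOnline(String regionName) {
        synchronized (lock) {
            return onlineRegions.contains(regionName);
        }
    }
}
```

Checking the flag and publishing the region under one lock avoids a check-then-act gap:
an open that finishes before shutdown begins is included in the regions to close, and an
open that finishes afterwards is rejected.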



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
