hbase-issues mailing list archives

From "Yunfan Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10464) Race condition during RS shutdown that could cause data loss
Date Tue, 04 Feb 2014 21:56:11 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yunfan Zhong updated HBASE-10464:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Race condition during RS shutdown that could cause data loss
> ------------------------------------------------------------
>
>                 Key: HBASE-10464
>                 URL: https://issues.apache.org/jira/browse/HBASE-10464
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Yunfan Zhong
>            Priority: Critical
>             Fix For: 0.89-fb
>
>         Attachments: D1120497.diff
>
>
> Bug scenario (T* are timestamps, with T1 < T2 < ... < Tn):
> 1. The master assigns a region to the RS at T1.
> 2. The RS opens the region between T1 and T3.
> 3. While the region is still opening, the RS starts to shut down at T2; its DFS client is closed at T5.
> 4. As a step of RS shutdown, the regions owned by the RS are closed, except for the newly opened region, which stays online from T3 to T5 and holds mutations in memory after its last flush at T4 (if a flush happened at all).
> 5. Because the master believes the RS shut down cleanly, no log splitting takes place, and the HLog is moved to the old logs directory as usual.
> 6. Mutations made between T4 and T5 (or between T3 and T5 if there was no flush at T4) are never flushed. They exist only in the WAL, and only if the WAL is turned on.
> The fix is to prevent a region open from succeeding while the RS is shutting down.
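
The idea behind the fix can be illustrated with a minimal Java sketch. This is not the attached patch (D1120497.diff) and does not use HBase classes; RegionOpenGuard, finishOpen and beginShutdown are hypothetical names chosen for illustration. It assumes the final "bring region online" step and the "start shutdown" step can share one lock, so that a region either comes online before shutdown begins (and is then included in the set of regions to close) or its open is rejected.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not HBase code and not the actual patch.
final class RegionOpenGuard {
    private final Object lock = new Object();
    private final Set<String> onlineRegions = new HashSet<>();
    private boolean stopping = false;

    /**
     * Last step of opening a region. Returns false if shutdown has already
     * begun; the caller is then expected to abandon the open so that the
     * region never accepts mutations on this server.
     */
    boolean finishOpen(String regionName) {
        synchronized (lock) {
            if (stopping) {
                return false;
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    /**
     * First step of shutdown. Marks the server as stopping and returns a
     * snapshot of the regions that must be flushed and closed. Because the
     * flag is set under the same lock as finishOpen, no region can come
     * online after this snapshot is taken.
     */
    Set<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            return new HashSet<>(onlineRegions);
        }
    }
}
```

In terms of the scenario above, a region whose open completes at T3 after shutdown started at T2 would get false from finishOpen and would never hold unflushed mutations between T3 and T5.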



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
