hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-955) FSImage.saveFSImage can lose edits
Date Wed, 10 Feb 2010 21:32:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832212#action_12832212 ]

Todd Lipcon commented on HDFS-955:
----------------------------------

h2. Without concurrent checkpoint

Looking at this as a series of state transitions on the storage directory:

{noformat}
State 1: normal operation
  Valid: IMAGE + EDITS
  (there is nothing special happening)

State 2: createNewIfNotExists
  Valid a: IMAGE + EDITS
  Valid b: IMAGE + EDITS + EDITS_NEW (since EDITS_NEW is empty)
  Current recovery: b

State 3: saving IMAGE_NEW
  Same validity as State 2
  Current recovery: b

State 4: save IMAGE_NEW complete
  Valid a: IMAGE + EDITS
  Valid b: IMAGE + EDITS + EDITS_NEW (since EDITS_NEW is empty)
  Valid c: IMAGE_NEW + EDITS_NEW (since EDITS_NEW is empty)
  Current recovery: b

State 5: truncate EDITS and EDITS_NEW
  (a) and (b) are no longer valid
  Valid c: IMAGE_NEW + EDITS_NEW
  Current recovery: b (incorrect; this is the error we're seeing)

State 6: rollFSImage -> purgeEditLog: moves EDITS_NEW to EDITS
  Valid: IMAGE_NEW + EDITS
  Current recovery: rename IMAGE_NEW to IMAGE (correct)

State 7: rollFSImage -> renameCheckpoint: moves IMAGE_NEW to IMAGE
  Valid: IMAGE + EDITS
  Current recovery: no recovery necessary (correct)
{noformat}

The problem here is in State 5: recovery still picks (b), but once the edit
logs are truncated only (c) is valid. The question is how to detect that
we are in this state during recovery so we can choose (c) instead.
This is where HDFS-957 comes in. With HDFS-957, the recovery logic can easily
determine that IMAGE_NEW is correct, and choose the same recovery
mechanism as State 6.
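
As a rough illustration of the decision described above, here is a minimal Java sketch. It is not the FSImage code: the class, enum, and method names are hypothetical, the "current" rules are only inferred from the state table, and the `imageNewComplete` flag is a stand-in for whatever signal HDFS-957 ends up providing.

```java
// Sketch only: names are hypothetical and the "current" rules are inferred
// from the state table above, not taken from the FSImage source.
final class RecoverySketch {
    enum Action {
        LOAD_IMAGE_EDITS_AND_EDITS_NEW, // recovery "b" in the table
        RENAME_IMAGE_NEW_TO_IMAGE,      // recovery used in State 6
        NONE                            // States 1 and 7
    }

    /** Recovery as the table describes it today: EDITS_NEW wins if present. */
    static Action currentRecovery(boolean editsNewExists, boolean imageNewExists) {
        if (editsNewExists) {
            return Action.LOAD_IMAGE_EDITS_AND_EDITS_NEW;
        }
        if (imageNewExists) {
            return Action.RENAME_IMAGE_NEW_TO_IMAGE;
        }
        return Action.NONE;
    }

    /**
     * Proposed recovery: imageNewComplete is an assumed stand-in for
     * whatever HDFS-957 offers to show that IMAGE_NEW was fully written.
     */
    static Action proposedRecovery(boolean editsNewExists, boolean imageNewExists,
                                   boolean imageNewComplete) {
        if (imageNewExists && imageNewComplete) {
            return Action.RENAME_IMAGE_NEW_TO_IMAGE;
        }
        // IMAGE_NEW is missing or still being written: keep today's behaviour.
        return currentRecovery(editsNewExists, imageNewExists);
    }

    public static void main(String[] args) {
        // State 5: EDITS_NEW and a complete IMAGE_NEW both exist.
        System.out.println("current:  " + currentRecovery(true, true));
        System.out.println("proposed: " + proposedRecovery(true, true, true));
    }
}
```

The sketch only decides which image to trust. What recovery should do with the leftover EDITS and EDITS_NEW files in States 4 and 5 is not addressed here.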

h2. With ongoing checkpoint (logs start rolled)

{noformat}
State 1: normal operation
  Valid: IMAGE + EDITS + EDITS_NEW
  Recovery: IMAGE + EDITS + EDITS_NEW

State 2: createNewIfNotExists
  no effect - EDITS_NEW already exists

State 3: saving IMAGE_NEW
  Same validity as State 2
  Current recovery: IMAGE + EDITS + EDITS_NEW

State 4: save IMAGE_NEW complete
  Valid a: IMAGE + EDITS + EDITS_NEW
  Valid b: IMAGE_NEW only
  Current recovery: a

State 5: truncate EDITS and EDITS_NEW
  Valid: IMAGE_NEW (any other recovery is incorrect)
  Current recovery: IMAGE + EDITS + EDITS_NEW (incorrect, loses data)

State 6: rollFSImage -> purgeEditLog: moves EDITS_NEW to EDITS
  Valid: IMAGE_NEW + EDITS
  Current recovery: rename IMAGE_NEW to IMAGE (correct)

State 7: rollFSImage -> renameCheckpoint: moves IMAGE_NEW to IMAGE
  Valid: IMAGE + EDITS
  Current recovery: no recovery necessary (correct)
{noformat}


So the issue in both cases is essentially the same, and both can be solved
if we use HDFS-957.
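
To make the State 5 data loss in the rolled-log case concrete, here is a toy Java model. It is purely illustrative: each file is reduced to a list of transaction names, and the class, method names, and transactions (t1, t2, t3) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model: each storage file is a list of transaction names. Illustration only.
final class State5LossSketch {
    /** Replays IMAGE, then EDITS, then EDITS_NEW, like the current recovery. */
    static List<String> recoverFromOldImage(List<String> image, List<String> edits,
                                            List<String> editsNew) {
        List<String> namespace = new ArrayList<>(image);
        namespace.addAll(edits);
        namespace.addAll(editsNew);
        return namespace;
    }

    /** Loads IMAGE_NEW alone, the only valid choice in State 5. */
    static List<String> recoverFromNewImage(List<String> imageNew) {
        return new ArrayList<>(imageNew);
    }

    public static void main(String[] args) {
        List<String> image = Arrays.asList("t1");
        List<String> imageNew = Arrays.asList("t1", "t2", "t3"); // saved in State 4
        List<String> edits = Collections.emptyList();    // truncated in State 5 (held t2)
        List<String> editsNew = Collections.emptyList(); // truncated in State 5 (held t3)

        System.out.println(recoverFromOldImage(image, edits, editsNew)); // [t1]
        System.out.println(recoverFromNewImage(imageNew));               // [t1, t2, t3]
    }
}
```

In this model the old recovery path comes back with only t1, because t2 and t3 now exist nowhere except IMAGE_NEW.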

I'll work on a patch for this.

On a side note, I think there's another race where a checkpoint upload from the SNN can
overlap with this operation and really screw things up. That's a separate JIRA though.


> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-955-unittest.txt, PurgeEditsBeforeImageSave.patch
>
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage function
> (implementing dfsadmin -saveNamespace) can corrupt the NN storage such that all current edits
> are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

