hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs
Date Sat, 04 Feb 2012 02:31:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200281#comment-13200281 ]

Todd Lipcon commented on HDFS-2874:
-----------------------------------

I ran two manual tests using mdadm fault injection:

Test 1:
- Set up the shared edits dir in an HA setup on the faulty mount
- Started both NNs and transitioned NN1 to active
- Switched the mount to "write-all" fault mode (block further writes)
- Issued a "touchz" command to the FS. The NN crashed with the following:
{code}
12/02/03 18:12:51 FATAL namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=FileJournalManager(root=/mnt/todd/name-shared), stream=org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream@2ad1918a))
java.io.IOException: Input/output error
...
12/02/03 18:12:51 FATAL namenode.FSEditLog: Could not sync enough journals to persistent storage. Unsynced transactions: 2
{code}
- Verified using xxd that the transaction was not written to the local storage directories either.
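
The difference between a required journal and an ordinary one, as seen in the FATAL message above, can be sketched roughly as follows. This is a toy model and not the FSEditLog code: the `Journal` class, `flush_all`, and `RequiredJournalError` are hypothetical names, and the list is assumed to hold the shared dir first, which is the ordering this issue proposes. The directory paths are borrowed from the logs purely for illustration.

```python
class RequiredJournalError(Exception):
    """Raised when a required journal cannot be flushed (hypothetical)."""


class Journal:
    """Hypothetical stand-in for one edits directory."""

    def __init__(self, root, required=False, faulty=False):
        self.root = root
        self.required = required
        self.faulty = faulty      # simulates a mount that rejects writes
        self.disabled = False
        self.synced = []          # txids that reached "stable storage"

    def flush(self, txids):
        if self.faulty:
            raise OSError("Input/output error")
        self.synced.extend(txids)


def flush_all(journals, txids):
    """Flush txids to each enabled journal, in list order.

    A failure on a required journal aborts the flush immediately, so
    journals later in the list never see the transactions. A failure on
    any other journal only disables that journal.
    """
    for journal in journals:
        if journal.disabled:
            continue
        try:
            journal.flush(txids)
        except OSError as err:
            if journal.required:
                raise RequiredJournalError(
                    f"flush failed for required journal {journal.root}: {err}"
                )
            journal.disabled = True


# Analogue of Test 1: the shared (required) journal sits on the faulty mount.
shared = Journal("/mnt/todd/name-shared", required=True, faulty=True)
local = Journal("/tmp/name1-name")
try:
    flush_all([shared, local], [1, 2])
except RequiredJournalError as err:
    print("FATAL:", err)
```

Because the required journal comes first and its failure aborts the flush, `local.synced` stays empty in this model, which matches the xxd check above. The non-required branch, where the bad journal is simply disabled, corresponds to what Test 2 exercises.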

Test 2:
- Set up a non-HA NN with multiple directories, one of which was on the faulty storage
- Set the mount to block all writes
- Issued a touchz command
- NN correctly disabled the bad mount:
{code}
12/02/03 18:20:11 ERROR namenode.FSEditLog: Error: flush failed for (journal JournalAndStream(mgr=FileJournalManager(root=/mnt/todd/name), stream=org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream@7b0b23cf))
{code}
... and kept running as expected
- Shut down the NN
- Cleared the fault and remounted
- Verified with xxd that the edit was persisted to the non-faulty directories
- Restarted NN. Verified existence of the file that I had touched. Startup messages included:
{code}
12/02/03 18:27:18 INFO namenode.FileJournalManager: Recovering unfinalized segments in /tmp/name1-name/current
12/02/03 18:27:18 INFO namenode.FileJournalManager: Finalizing edits file /tmp/name1-name/current/edits_inprogress_0000000000000000001 -> /tmp/name1-name/current/edits_0000000000000000001-0000000000000000006
12/02/03 18:27:18 INFO namenode.FileJournalManager: Recovering unfinalized segments in /tmp/name1-name2/current
12/02/03 18:27:18 INFO namenode.FileJournalManager: Finalizing edits file /tmp/name1-name2/current/edits_inprogress_0000000000000000001 -> /tmp/name1-name2/current/edits_0000000000000000001-0000000000000000006
12/02/03 18:27:18 INFO namenode.FileJournalManager: Recovering unfinalized segments in /mnt/todd/name/current
12/02/03 18:27:18 INFO namenode.FileJournalManager: Finalizing edits file /mnt/todd/name/current/edits_inprogress_0000000000000000001 -> /mnt/todd/name/current/edits_0000000000000000001-0000000000000000003
{code}
(/mnt/todd was the mount that had died, hence it had fewer edits.)
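
For anyone repeating this check, the last txid per directory can be read straight off the finalized file names in the startup messages. A small sketch, assuming the `edits_<firstTxId>-<lastTxId>` naming seen in those log lines; the `txid_range` helper and the regex are my own and not part of HDFS:

```python
import re

# Matches finalized segments only; edits_inprogress_* names do not match.
FINALIZED = re.compile(r"edits_(\d+)-(\d+)$")


def txid_range(path):
    """Return (first_txid, last_txid) parsed from a finalized edits file name."""
    match = FINALIZED.search(path)
    if match is None:
        raise ValueError(f"not a finalized edits file: {path}")
    return int(match.group(1)), int(match.group(2))


# Paths copied from the startup messages above.
paths = [
    "/tmp/name1-name/current/edits_0000000000000000001-0000000000000000006",
    "/tmp/name1-name2/current/edits_0000000000000000001-0000000000000000006",
    "/mnt/todd/name/current/edits_0000000000000000001-0000000000000000003",
]

# Map each storage directory to the last txid it holds.
last_txids = {p.split("/current/")[0]: txid_range(p)[1] for p in paths}
newest = max(last_txids.values())
behind = [d for d, last in last_txids.items() if last < newest]
print(behind)
```

Here `behind` ends up containing only the directory on the mount that had the fault injected.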



                
> HA: edit log should log to shared dirs before local dirs
> --------------------------------------------------------
>
>                 Key: HDFS-2874
>                 URL: https://issues.apache.org/jira/browse/HDFS-2874
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-2874.txt, hdfs-2874.txt, hdfs-2874.txt
>
>
> Currently, the NN logs its edits to each of its edits directories in sequence. This can produce the following bad sequence:
> - NN accumulates 100 edits (tx 1-100) in the buffer. It writes and syncs them to a local drive, then crashes before writing to the shared edits dir.
> - Failover occurs. The SBN takes over at txid=1, since txid 1 was never written to the shared edits dir.
> - First NN restarts. It reads up to txid 100 from its local directories. It is now "ahead" of the active NN with inconsistent state.
> The solution is to write to the shared edits dir, and sync that, before writing to any local drives.
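
The bad sequence in the description, and why reordering the writes avoids it, can be illustrated with a toy simulation. Everything here is hypothetical (plain lists standing in for edits directories, a `crash_after` counter standing in for an NN crash); it only models the order of syncs, not the actual edit log code:

```python
def log_edits(journals, order, txids, crash_after):
    """Sync txids to journals in the given order.

    The simulated NN crashes once crash_after journals have been synced,
    so journals later in the order never receive the transactions.
    """
    for count, name in enumerate(order):
        if count == crash_after:
            break  # simulated crash
        journals[name].extend(txids)


def local_is_ahead(journals):
    """True if the local dir holds a txid that the shared dir lacks."""
    shared = set(journals["shared"])
    return any(txid not in shared for txid in journals["local"])


txids = list(range(1, 101))  # tx 1-100, as in the description

# Old behavior: local dir first, crash before the shared dir is written.
old = {"local": [], "shared": []}
log_edits(old, ["local", "shared"], txids, crash_after=1)

# Proposed behavior: shared dir first, crash before the local dir is written.
new = {"local": [], "shared": []}
log_edits(new, ["shared", "local"], txids, crash_after=1)

print(local_is_ahead(old), local_is_ahead(new))
```

With the old ordering the restarted NN would find edits locally that the shared dir never saw. With the shared dir first, a crash at the same point leaves the local dir behind rather than ahead, and in this model it never holds an edit the shared dir lacks.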


        
