hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1137) Name node is using the write-ahead log improperly
Date Sat, 08 May 2010 04:38:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865397#action_12865397 ]

Todd Lipcon commented on HDFS-1137:

I always assumed this was entirely on purpose. Because of the coarse-grained locking in FSNamesystem,
"fixing" this would basically serialize all writes 1:1 with syncs to the edit log, which would
drastically decrease write throughput.

We already do sync() before returning to the writer, so any write that the writer thinks is
successful is guaranteed to be durable. It's just that other readers may see changes that have
not yet been made durable.
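
As a rough illustration of the ordering described above, here is a minimal Java sketch. It is a toy
model, not the actual FSNamesystem or edit log code: the class ApplyThenLogSketch, its methods, and
the in-memory "durable" list standing in for the on-disk log are all hypothetical. A change is applied
and buffered under the namespace lock, and the sync happens afterwards, before the writer returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of apply-then-log with a deferred sync. All names here are hypothetical. */
public class ApplyThenLogSketch {
    private final Map<String, String> namespace = new HashMap<>(); // in-memory state
    private final List<String> buffered = new ArrayList<>();       // logged but not yet synced
    private final List<String> durable = new ArrayList<>();        // stands in for the on-disk log
    private final Object logLock = new Object();                   // serializes syncs only

    /** Apply the change and buffer the edit while holding the (coarse) namespace lock. */
    public synchronized void applyAndBuffer(String path, String value) {
        namespace.put(path, value);
        buffered.add(path + "=" + value);
    }

    /** Make buffered edits durable. The namespace lock is held only to grab the batch. */
    public void sync() {
        synchronized (logLock) {
            List<String> batch;
            synchronized (this) {
                batch = new ArrayList<>(buffered);
                buffered.clear();
            }
            durable.addAll(batch); // a real log would write and fsync here
        }
    }

    /** A writer's view: the call returns only after its edit has been synced. */
    public void write(String path, String value) {
        applyAndBuffer(path, value);
        sync();
    }

    /** Readers see in-memory state, including changes that are not yet durable. */
    public synchronized String read(String path) {
        return namespace.get(path);
    }

    /** Simulate a crash: drop unsynced edits and rebuild state from the durable log. */
    public void crashAndRecover() {
        synchronized (logLock) {
            synchronized (this) {
                buffered.clear();
                namespace.clear();
                for (String edit : durable) {
                    String[] kv = edit.split("=", 2);
                    namespace.put(kv[0], kv[1]);
                }
            }
        }
    }
}
```

In this sketch, a reader that calls read() between applyAndBuffer() and sync() sees a value that
crashAndRecover() would discard, while anything a writer got back from write() survives. Several
writers can also share one sync, which is the throughput argument made above.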

I think this is perfectly acceptable for a filesystem, and it's exactly what you see in systems
like ext3 - writes to the metadata journal are not synced unless you explicitly call fsync(),
so a reader can read data which will disappear after a crash.
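
For comparison, an application that wants its own data on disk before proceeding has to ask for it
explicitly. The sketch below shows one way to do that from Java using FileChannel.force, which is the
JDK's counterpart to fsync(); the class name ExplicitSyncSketch and the helper appendDurably are made
up for illustration and are not part of HDFS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: append a record and force it to stable storage before returning. */
public class ExplicitSyncSketch {
    public static void appendDurably(Path file, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf); // may write fewer bytes than requested, so loop
            }
            ch.force(true);    // flush file content and metadata, like fsync()
        }
    }
}
```

Without the force(true) call, the write may sit in the OS page cache, visible to other readers of the
file but not guaranteed to survive a crash.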

> Name node is using the write-ahead log improperly
> -------------------------------------------------
>                 Key: HDFS-1137
>                 URL: https://issues.apache.org/jira/browse/HDFS-1137
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Benjamin Reed
> The Namenode is using the write-ahead log (WAL), also known as the edit log, improperly. Usually,
when using a WAL, changes are written to the log before they are applied to the state. Currently
the Namenode writes to the WAL after applying the change. This means that reads may see changes
before they are durable. A client may read information and the server may fail before the information
is written to the WAL, which results in the client having read state that disappears. To fix this, the
Namenode should write changes to the log before (that is, ahead of) applying them.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
