hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
Date Thu, 09 Jun 2011 07:15:59 GMT

     [ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-988:
-----------------------------

    Attachment: hdfs-988-6.patch

Thanks for the thorough review Todd!

bq. computeDatanodeWork calls blockManager.computeReplicationWork and blockManager.computeInvalidateWork...

Agree. Added a comment in computeDatanodeWork with the rationale.

bq. In heartbeatCheck, I think we can simply put another "if (isInSafeMode()) return" in right
after it takes the writeLock if it finds a dead node...

Agree, added the additional check, and a comment above the first lock acquisition.
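
For reference, a minimal sketch of the re-check pattern (names follow FSNamesystem, but the body is illustrative, not the patch verbatim):

{code}
void heartbeatCheck() {
  if (isInSafeMode()) {
    return; // cheap early exit, no removal work is done while in safemode
  }
  // ... scan heartbeats (under the heartbeats lock) and find a dead node ...
  writeLock();
  try {
    // Re-check: safemode may have been entered between the first check
    // and acquiring the namesystem write lock.
    if (isInSafeMode()) {
      return;
    }
    // ... remove the dead datanode, update block state, etc ...
  } finally {
    writeUnlock();
  }
}
{code}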

bq. isLockedReadOrWrite should be checking this.fsLock.getReadHoldCount()
rather than getReadLockCount()

Fixed.
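
For the record, the distinction that matters here: ReentrantReadWriteLock#getReadLockCount() counts read locks held by any thread, while getReadHoldCount() counts only the calling thread's reentrant holds, which is what a "this thread holds the lock" check needs. Sketch only; the patch may differ in detail:

{code}
boolean isLockedReadOrWrite() {
  return this.fsLock.getReadHoldCount() > 0
      || this.fsLock.getWriteHoldCount() > 0;
}
{code}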

bq. FSDirectory#bLock says it protects the block map, but it also protects the directory,
right? we should update the comment and perhaps the name.

Correct. I renamed it dirLock to be consistent with FSNamesystem and to reflect that it protects
the entire directory, not just the block map. Updated the comment too.

bq. various functions don't take the read lock because they call functions in FSDirectory
that take FSDirectory.bLock. This seems incorrect..

I think the FSN methods do need to get the readlock before calling into FSD methods that get
the read lock. In the latest patch I added this to the places where it was missing (getFileInfo,
getContentSummary, getPreferredBlockSize, etc); hopefully the new asserts make it clear that
we hold the FSN lock when calling into FSD (where necessary). For HADOOP-7362 we should assert
the lock hierarchy is obeyed.
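
Roughly the shape of those call sites and asserts (simplified, with illustrative names and signatures, not the patch verbatim):

{code}
// In FSNamesystem:
HdfsFileStatus getFileInfo(String src) throws IOException {
  readLock();                      // take the FSN lock first ...
  try {
    return dir.getFileInfo(src);   // ... then call into FSDirectory
  } finally {
    readUnlock();
  }
}

// In FSDirectory, guarding against callers that skip the FSN lock:
//   assert hasReadLock() : "FSNamesystem read lock not held";
{code}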

bq. handleHeartbeat calls getDatanode() while only holding locks on heartbeats and datanodeMap,
but registerDatanode mutates datanodeMap without locking either.

I modified handleHeartbeat to take the read lock so it's synchronized with registerDatanode.
I also added a missing synchronization of datanodeMap to wipeDatanode. I think the locking
in FSN (eg the way heartbeats and datanodeMap are locked) needs to be revisited in light of
this change. I'll file a jira for that; it's probably out of scope for this jira.
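
A sketch of the wipeDatanode change (illustrative only; the surrounding block-removal logic is elided):

{code}
void wipeDatanode(DatanodeID nodeID) throws IOException {
  String key = nodeID.getStorageID();
  synchronized (datanodeMap) {   // previously mutated without this monitor
    datanodeMap.remove(key);
  }
  // ... remove from host/network maps, etc ...
}
{code}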

bq. getDataNodeInfo seems like an unused function with no locking - can we remove it?

Yes. Done.

bq. several other places access datanodeMap with synchronization on that object itself.  unprotectedAddDatanode
should assert it holds that monitor lock

Made unprotectedAddDatanode synchronize on datanodeMap; per above, I think we need
to revisit how this data structure is accessed.
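
Sketch of the two options (synchronizing as done here, or asserting the monitor as suggested); names illustrative:

{code}
void unprotectedAddDatanode(DatanodeDescriptor nodeDescr) {
  synchronized (datanodeMap) {
    datanodeMap.put(nodeDescr.getStorageID(), nodeDescr);
  }
}

// Alternative, per the review comment: assume the caller already holds the
// monitor and just assert it.
//   assert Thread.holdsLock(datanodeMap) : "datanodeMap monitor not held";
{code}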

bq. when loading the edit log, why doesn't loadFSEdits take a write lock on the namesystem
before it starts? then we could add all of the asserts and not worry about it.

Done. It now takes the namesystem and dir locks, and I modified all the unprotected methods
to assert the lock is held.
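
The shape is roughly (illustrative; the actual lock helpers and call path may differ):

{code}
int loadFSEdits(EditLogInputStream edits) throws IOException {
  fsNamesys.writeLock();
  fsDir.writeLock();
  try {
    return loadEditRecords(edits);   // replays OP_ADD, OP_DELETE, ...
  } finally {
    fsDir.writeUnlock();
    fsNamesys.writeUnlock();
  }
}

// In each FSDirectory.unprotected*() method:
//   assert hasWriteLock() : "FSDirectory write lock not held";
{code}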

bq. it looks like saving the image no longer works, since saveFilesUnderConstruction now takes
the readLock, but it's being called by a different thread than took the write lock in saveNamespace.

Yup, TestEditLogRace quickly found this deadlock. I removed the new locking from saveFilesUnderConstruction.
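
The deadlock class is easy to reproduce in isolation: ReentrantReadWriteLock read acquisition blocks while a different thread holds the write lock, so the image-saver threads spawned under saveNamespace's write lock must not take the read lock themselves. Minimal standalone illustration (not HDFS code):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SaverDeadlockDemo {
  public static void main(String[] args) throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.writeLock().lock();              // e.g. saveNamespace in the handler thread
    Thread saver = new Thread(() -> {
      lock.readLock().lock();             // blocks: write lock held by another thread
      try {
        System.out.println("saving files under construction");
      } finally {
        lock.readLock().unlock();
      }
    });
    saver.start();
    saver.join(1000);                     // times out; the saver never gets the read lock
    System.out.println("saver alive = " + saver.isAlive());  // prints true
    lock.writeLock().unlock();            // only now can the saver proceed
  }
}
{code}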

bq. When create() is called with the overwrite flag true, that calls delete() which will logSync()
while holding the lock. We can hold off on fixing it since it's a performance problem, not
correctness, and the operation is fairly rare.

Filed HDFS-2052.

bq. getAdditionalBlock doesn't logSync() - I think there's another issue pending about that
since it will affect HA. Let's address later.

Since getAdditionalBlock doesn't do anything that writes to the log, the sync is not currently
needed. I updated HDFS-978 to indicate it will be needed when block allocations are logged.

bq. checkFileProgress doesn't really need the write lock

Yup, modified it to use a read-lock.

bq. seems like saveNamespace could safely just take the read lock to allow other readers to
keep working

Yup, modified to take the read lock.

Fixed the nits.

The tests should be passing; I'm going to do some testing with jcarder enabled. I have also
run TestNNThroughputBenchmark; the numbers are noisy, mostly better than without the patch (not
sure I believe that).

> saveNamespace can corrupt edits log, apparently due to race conditions
> ----------------------------------------------------------------------
>
>                 Key: HDFS-988
>                 URL: https://issues.apache.org/jira/browse/HDFS-988
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20-append, 0.21.0, 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Eli Collins
>            Priority: Blocker
>             Fix For: 0.20-append, 0.22.0
>
>         Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, hdfs-988-2.patch, hdfs-988-3.patch,
> hdfs-988-4.patch, hdfs-988-5.patch, hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt,
> saveNamespace.txt, saveNamespace_20-append.patch
>
>
> The administrator puts the namenode in safemode and then issues the savenamespace command.
> This can corrupt the edits log. The problem is that when the NN enters safemode, there could
> still be pending logSyncs occurring from other threads. Now, the saveNamespace command, when
> executed, would save an edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

