hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2223) Untangle depencencies between NN components
Date Tue, 06 Sep 2011 00:17:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097647#comment-13097647 ]

Todd Lipcon commented on HDFS-2223:

Looking into the -1s, I see that this is causing TestEditLogRace.testSaveNamespace to time
out. It's a kind of messy situation -- the FSN rwlock's fairness policy leads to a deadlock:
- saveNamespace acquires the read lock
- it spawns an image saver thread
- another thread comes in to do mkdirs and waits on the write lock
- the image saver thread calls getNamespaceInfo (new in this patch) and wants the read lock
The fairness policy says that the image saver thread can't get the read lock, since someone is
already waiting on the write lock. So, it hangs there. The writer is waiting on the main thread
to release the read lock, and the main thread is join()ing on the saver thread.
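
A minimal sketch of the sequence above, assuming the FSN lock behaves like a fair java.util.concurrent.locks.ReentrantReadWriteLock. The class and method names are hypothetical and not taken from the patch, and the saver uses a timed tryLock so that the demo reports the stall instead of hanging:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it only mimics the thread interleaving described above.
class FairRwLockHang {
    /** Returns true if the "image saver" thread obtained the read lock in time. */
    static boolean saverGetsReadLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair mode
        boolean[] acquired = new boolean[1];
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();      // "mkdirs": queues behind the reader
            lock.writeLock().unlock();
        });
        Thread saver = new Thread(() -> {
            try {
                // The timed tryLock honors the fairness policy; the untimed
                // tryLock() would barge past the queued writer.
                if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                    acquired[0] = true;
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            lock.readLock().lock();       // "saveNamespace" takes the read lock
            try {
                writer.start();
                while (!lock.hasQueuedThreads()) {
                    Thread.sleep(1);      // wait until the writer is queued
                }
                saver.start();            // "image saver" now wants the read lock
                saver.join();             // main thread waits on the saver
            } finally {
                // With an untimed lock() in the saver, join() would never
                // return and this unlock would never run.
                lock.readLock().unlock();
            }
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("saver acquired read lock: " + saverGetsReadLock());
    }
}
```

In this sketch the saver's request times out while the writer is queued, so the method returns false. Replacing the timed tryLock with a plain lock() would reproduce the three-way wait.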

I think the easiest solution is to remove the locking from getNamespaceInfo, since in practice
none of the fields it reads change after the FSN is initialized. I will upload a new patch.
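
A rough sketch of that idea, assuming the fields in question can be modeled as final fields assigned once at construction. MiniNamesystem and its fields are hypothetical stand-ins, not the real FSNamesystem:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the FSN; not the real FSNamesystem. */
class MiniNamesystem {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
    private final int namespaceId;    // assumed: set once, never mutated
    private final String clusterId;   // assumed: set once, never mutated

    MiniNamesystem(int namespaceId, String clusterId) {
        this.namespaceId = namespaceId;
        this.clusterId = clusterId;
    }

    /** Takes no lock: it reads only final fields. */
    String getNamespaceInfo() {
        return namespaceId + "/" + clusterId;
    }

    /** Same interleaving as the hang, but the saver no longer needs the read lock. */
    String saveNamespace() {
        String[] seen = new String[1];
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();      // "mkdirs": queues behind the reader
            lock.writeLock().unlock();
        });
        Thread saver = new Thread(() -> seen[0] = getNamespaceInfo());
        try {
            lock.readLock().lock();
            try {
                writer.start();
                while (!lock.hasQueuedThreads()) {
                    Thread.sleep(1);      // wait until the writer is queued
                }
                saver.start();
                saver.join();             // returns: the saver never touches the lock
            } finally {
                lock.readLock().unlock();
            }
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(new MiniNamesystem(42, "cluster-a").saveNamespace());
    }
}
```

Because the getter never requests the read lock, a queued writer cannot block it, and the join() in saveNamespace completes. This holds only as long as the fields really are immutable after initialization.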

> Untangle depencencies between NN components
> -------------------------------------------
>                 Key: HDFS-2223
>                 URL: https://issues.apache.org/jira/browse/HDFS-2223
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-2223-1.txt, hdfs-2223-2.txt, hdfs-2223-3.txt, hdfs-2223-4.txt,
> hdfs-2223-5.txt, hdfs-2223-6.txt, hdfs-2223-7.txt, hdfs-2223-8.txt
> Working in the NN a lot for HA (HDFS-1623), I've come across a number of situations where
> the tangled dependencies between NN components have been problematic for adding new features
> and for testability. It would be good to untangle some of these and clarify what the distinction
> is between the different components: NameNode, FSNamesystem, FSDirectory, FSImage, NNStorage,
> and FSEditLog.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

