hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-752) Possible locking issues in HDFS Namenode
Date Tue, 28 Nov 2006 19:18:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-752?page=comments#action_12454079 ] 
Raghu Angadi commented on HADOOP-752:

DFSNodesStatus() locks 'heartBeats' and 'datanodeMap'. As you noted, neither of these is
locked in registerDatanode().

I think we should have an explicitly stated locking policy: which locks are held to
protect which state. This will help anyone writing new code or reading the existing code.
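
As a rough illustration of what such a policy could look like when written down next to the
state it protects, here is a minimal sketch. It is not the actual Namenode code: the class
LockPolicySketch, its fields, and the method signatures are hypothetical, and the lock order
(heartBeats first, then datanodeMap) is an assumption chosen only for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a documented lock policy; not the real FSNamesystem. */
public class LockPolicySketch {
    // Lock policy (assumed for illustration):
    //   heartBeats  : guarded by synchronized (heartBeats)
    //   datanodeMap : guarded by synchronized (datanodeMap)
    //   Lock order  : always heartBeats first, then datanodeMap.
    private final List<String> heartBeats = new ArrayList<>();
    private final Map<String, Long> datanodeMap = new HashMap<>();

    /** Writer: takes both locks in the documented order before mutating. */
    public void registerDatanode(String id, long now) {
        synchronized (heartBeats) {
            synchronized (datanodeMap) {
                // Only add to heartBeats the first time this id is seen.
                if (datanodeMap.put(id, now) == null) {
                    heartBeats.add(id);
                }
            }
        }
    }

    /** Writer: removes a node from both structures under the same locks. */
    public void wipeDatanode(String id) {
        synchronized (heartBeats) {
            synchronized (datanodeMap) {
                datanodeMap.remove(id);
                heartBeats.remove(id);
            }
        }
    }

    /** Reader: same locks, same order; returns a copy so callers iterate lock-free. */
    public List<String> datanodeReport() {
        synchronized (heartBeats) {
            synchronized (datanodeMap) {
                return new ArrayList<>(heartBeats);
            }
        }
    }
}
```

The point of the sketch is only that every reader and writer of a given structure goes
through the same lock, in the same order, and that the rule is stated where the fields are
declared.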

> Possible locking issues in HDFS Namenode
> ----------------------------------------
>                 Key: HADOOP-752
>                 URL: http://issues.apache.org/jira/browse/HADOOP-752
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
> I have been investigating the cause of random Namenode memory corruptions/memory overflows,
> etc. Please comment.
>  1. The functions datanodeReport() and DFSNodesStatus() do not acquire the global lock.
>    This can race with another thread invoking registerDatanode(). registerDatanode()
>    can remove a datanode (through wipeDatanode()) while the datanodeReport thread is
>    traversing the list of datanodes. This can cause exceptions to occur.
>  2. The blocksMap is protected by the global lock. The setReplication() call does not
>    acquire the global lock when it calls processOverReplicatedBlock(). This can cause
>    corruption in blocksMap.
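
The hazard in item 1 can be sketched in a few lines of Java. This is a simplified,
single-threaded stand-in, assuming a plain java.util.HashMap: the class TraversalRaceSketch
and its methods are hypothetical, and the removal inside the loop merely simulates another
thread calling wipeDatanode() in the middle of a traversal. In a real multi-threaded race
the failure is not guaranteed to surface as an exception; silent corruption is also possible.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the traversal-versus-removal hazard; not Namenode code. */
public class TraversalRaceSketch {
    private final Map<String, String> datanodeMap = new HashMap<>();

    public synchronized void add(String id) {
        datanodeMap.put(id, "host-" + id);
    }

    public synchronized void wipe(String id) {
        datanodeMap.remove(id);
    }

    /** Guarded reader: holds the same monitor as wipe() while copying the keys. */
    public synchronized List<String> report() {
        return new ArrayList<>(datanodeMap.keySet());
    }

    /**
     * Unguarded reader: a removal lands while the map is being traversed.
     * Returns true if the traversal failed with ConcurrentModificationException.
     */
    public boolean unguardedTraversalFails() {
        try {
            for (String id : datanodeMap.keySet()) {
                // Stands in for a concurrent wipeDatanode() from another thread.
                datanodeMap.remove(id);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

With the guarded report(), the traversal and the removal are serialized on one monitor, which
is the effect that acquiring the global lock in datanodeReport() would have.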


