hadoop-hdfs-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2184) Revisit Namenode locking
Date Fri, 22 Jul 2011 14:40:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069567#comment-13069567 ]

Eric Payne commented on HDFS-2184:
----------------------------------

Hi Eli,

Over the past month or so I have noticed a slowdown in namenode performance when it is
under stress. Over the same period I have also noticed a striking increase in the
namenode's reliability under stress. Even so, there are probably opportunities in the
FSNamesystem and FSDirectory classes to improve performance.

Thanks,
-Eric

> Revisit Namenode locking
> ------------------------
>
>                 Key: HDFS-2184
>                 URL: https://issues.apache.org/jira/browse/HDFS-2184
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Eli Collins
>
> While working on HDFS-988 I noticed that the locking in FSNamesystem and FSDirectory
> could be improved. Some observations:
> The namesystem lock (fsLock) is always taken before acquiring the directory lock (dirLock).
> Therefore the only time when the directory lock is needed is when the fsLock is taken for
> reading and the directory lock is taken for writing, but I don't think that ever happens.
> Therefore we can probably get rid of the directory lock.
> In HDFS-988 I modified handleHeartbeat to take the read lock so it's synchronized with
> registerDatanode. I also added a missing synchronization of datanodeMap to wipeDatanode because
> handleHeartbeat calls getDatanode() while only holding locks on heartbeats and datanodeMap,
> but registerDatanode mutates datanodeMap without locking either. We should revisit which
> locks/synchronization protect which data structures; there may be other similar bugs and
> also opportunities to increase parallelism.
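
The lock-ordering observation in the description can be illustrated with a minimal
sketch. It is not the actual FSNamesystem/FSDirectory code: the class NestedLockSketch,
its methods, and its counter are hypothetical, and only the lock names fsLock and dirLock
come from the description. It assumes both locks are java.util.concurrent
ReentrantReadWriteLock instances. The counter records how often the inner dirLock write
lock is taken while the outer fsLock is not held exclusively, which is the only case in
which dirLock would add any protection.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the nested locking described above; not HDFS code. */
class NestedLockSketch {
    // Outer namesystem lock and inner directory lock (names from the description).
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock dirLock = new ReentrantReadWriteLock();

    // Counts directory writes made without exclusive fsLock; guarded by dirLock.
    private int dirWritesWithoutFsWrite = 0;

    /** Usual mutation path: fsLock (write) first, then dirLock (write). */
    void mutateUnderFsWrite(Runnable change) {
        fsLock.writeLock().lock();
        try {
            writeDirectory(change);
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    /** The case the description doubts ever happens: fsLock (read), dirLock (write). */
    void mutateUnderFsRead(Runnable change) {
        fsLock.readLock().lock();
        try {
            writeDirectory(change);
        } finally {
            fsLock.readLock().unlock();
        }
    }

    private void writeDirectory(Runnable change) {
        dirLock.writeLock().lock();
        try {
            // If this thread already holds fsLock exclusively, dirLock is redundant here.
            if (!fsLock.isWriteLockedByCurrentThread()) {
                dirWritesWithoutFsWrite++;
            }
            change.run();
        } finally {
            dirLock.writeLock().unlock();
        }
    }

    int dirWritesWithoutFsWrite() {
        dirLock.readLock().lock();
        try {
            return dirWritesWithoutFsWrite;
        } finally {
            dirLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        NestedLockSketch sketch = new NestedLockSketch();
        sketch.mutateUnderFsWrite(() -> { });
        sketch.mutateUnderFsRead(() -> { });
        System.out.println(sketch.dirWritesWithoutFsWrite());
    }
}
```

If instrumentation along these lines never reported a non-zero count in the real code
paths, that would support the suggestion that the directory lock could be dropped. The
sketch only shows how such a check might look; it makes no claim about what the actual
namenode code does.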

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
