hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7396) Revisit synchronization in Namenode
Date Mon, 17 Nov 2014 05:12:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214269#comment-14214269 ]

Colin Patrick McCabe commented on HDFS-7396:

I would echo [~tlipcon]'s comment here: https://issues.apache.org/jira/browse/HDFS-2206?focusedCommentId=13071354&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13071354

If we're revisiting synchronization, the first thing we should do is document the locking
hierarchy: which locks have to be taken before which other locks to avoid deadlock.  This is
very ad hoc in the code right now, and I'm sure we have a lot of lurking
deadlocks.  Adding even more locks is going to make documenting the order even more critical.
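
To make the idea of a documented hierarchy concrete, here is a minimal sketch of a lock wrapper that checks acquisition order at runtime. Everything in it is hypothetical: the RankedLock class, the rank numbers, and the lock names are illustrations only, not existing HDFS code and not a claim about what the actual NameNode ordering is or should be.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: a lock that enforces a documented acquisition order. */
final class RankedLock {
    // Ranks of the locks the current thread holds, most recently acquired on top.
    private static final ThreadLocal<Deque<Integer>> HELD =
        ThreadLocal.withInitial(ArrayDeque::new);

    private final String name;
    private final int rank;  // lower ranks must be acquired before higher ranks
    private final ReentrantLock lock = new ReentrantLock();

    RankedLock(String name, int rank) {
        this.name = name;
        this.rank = rank;
    }

    void lock() {
        Deque<Integer> held = HELD.get();
        Integer top = held.peek();
        // Reject any acquisition that does not move strictly up the hierarchy.
        // For simplicity this also rejects re-acquiring a lock already held.
        if (top != null && top >= rank) {
            throw new IllegalStateException("lock order violation: " + name
                + " (rank " + rank + ") requested while holding rank " + top);
        }
        lock.lock();
        held.push(rank);
    }

    void unlock() {
        Deque<Integer> held = HELD.get();
        Integer top = held.peek();
        // This sketch assumes locks are released in reverse acquisition order.
        if (top == null || top != rank) {
            throw new IllegalStateException("out-of-order release of " + name);
        }
        held.pop();
        lock.unlock();
    }
}
```

With two such locks, say a rank-1 lock and a rank-2 lock, taking rank 1 and then rank 2 succeeds, while taking rank 2 and then rank 1 fails immediately instead of deadlocking only under an unlucky interleaving. The check throws before touching the underlying lock, so a violation leaves nothing held.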

I'm also curious what relationship this JIRA has with the idea of separating the block manager
from the NameNode.  If we are going to separate the BlockManager from the FSNamesystem (into
separate daemons), perhaps we need the FSN's calls into the BM to take place without any locks
held.  Otherwise, a slow network connection or a laggy RPC could really handicap the NameNode.
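
One common way to keep a remote call outside a lock is to copy the needed state while holding the lock, release it, and only then make the call. The sketch below illustrates that pattern under stated assumptions: NamespaceSketch, pendingBlockIds, and flushTo are hypothetical names, and the Consumer merely stands in for an RPC to a separate block manager daemon. It is not based on any actual FSNamesystem or BlockManager interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

/** Hypothetical sketch: snapshot state under the lock, call out without it. */
final class NamespaceSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Long> pendingBlockIds = new ArrayList<>();

    void addPending(long blockId) {
        lock.writeLock().lock();
        try {
            pendingBlockIds.add(blockId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** The consumer stands in for a possibly slow RPC to a remote block manager. */
    void flushTo(Consumer<List<Long>> remoteBlockManager) {
        List<Long> batch;
        lock.writeLock().lock();
        try {
            // Copy and clear under the lock so the batch is a consistent snapshot.
            batch = new ArrayList<>(pendingBlockIds);
            pendingBlockIds.clear();
        } finally {
            lock.writeLock().unlock();
        }
        // The lock is released here, so a laggy call cannot block other operations.
        remoteBlockManager.accept(batch);
    }

    boolean isWriteLockedByCurrentThread() {
        return lock.isWriteLockedByCurrentThread();
    }
}
```

The sketch leaves out failure handling: if the remote call fails, the batch has already been removed from the pending list, so a real design would need a retry or re-queue step.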

> Revisit synchronization in Namenode
> -----------------------------------
>                 Key: HDFS-7396
>                 URL: https://issues.apache.org/jira/browse/HDFS-7396
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
> HDFS-2106 separated block management from the namenode into a new package.  As part of that work,
> some code was refactored into new classes such as DatanodeManager, HeartbeatManager, etc.  There
> are opportunities to improve locking in the namenode, where synchronization is currently
> done mainly by a single global FSNamesystem lock.

This message was sent by Atlassian JIRA