hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1093) Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock
Date Wed, 14 Jul 2010 18:24:55 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888473#action_12888473
] 

Suresh Srinivas commented on HDFS-1093:
---------------------------------------

bq. I agree with you that there are code portions in processReport, createSymLinkInternal,
startFileInternal() that can move outside the FSNamesystem lock. However, I would like to
avoid doing this code reorganization as part of this JIRA, especially because it makes the
code difficult to review. Also, this is not a regression because the original code has all
of this code inside the synchronized section anyway. Please let me know if you agree on this
one.
I agree that this jira may not be the right place for this optimization. Unlike the earlier
code with synchronized methods, this change makes it possible to shorten the locked
sections. We can create a bug to track this optimization.

bq. getBlockLocations() - I now acquire the readLock and attempt to proceed ahead. If the
access-time has to be set, then I release the readLock, acquire the writeLock and start all
over again
Why not check doAccessTime up front: if it is true, grab the writeLock, otherwise the readLock?
Make the doAccessTime parameter final. This change seems much simpler, since there is no need
to repeat the initial steps such as looking up the inode, computing now, etc.
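
A minimal sketch of this suggestion, assuming the namesystem lock is a java.util.concurrent ReentrantReadWriteLock (which cannot upgrade a held read lock to a write lock). The class, field, and return value below are hypothetical stand-ins for illustration, not the actual FSNamesystem code.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for FSNamesystem; names are illustrative only.
class AccessTimeLockSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private long accessTime = 0L; // stands in for the inode's access time

    // Returns the lock mode that was used, purely for illustration.
    String getBlockLocations(final boolean doAccessTime, long now) {
        // Pick the lock once, before any work, so nothing has to be redone.
        final Lock lock = doAccessTime ? fsLock.writeLock() : fsLock.readLock();
        lock.lock();
        try {
            if (doAccessTime) {
                accessTime = now; // mutation, so the write lock is required
                return "write";
            }
            return "read";
        } finally {
            lock.unlock();
        }
    }

    long getAccessTime() {
        fsLock.readLock().lock();
        try {
            return accessTime;
        } finally {
            fsLock.readLock().unlock();
        }
    }
}
```

One trade-off of this shape: the writeLock is taken whenever doAccessTime is true, even on calls where the access time turns out not to need an update.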

bq. removeStoredBlock - assert to be replaced by grab writeLock: I have the impression that
all calls to removeStoredBlock already hold the writeLock, that is the reason for the assert.
Do you know of a code path via which this is not the case?
I agree this method is not called without holding the writeLock. However, assertions are not
enabled at run time by default, so the check does nothing outside of tests. Also, this code
changes the previous semantics.
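
To illustrate the difference, here is a small sketch, again assuming a ReentrantReadWriteLock. The class and method names are hypothetical. Because the write lock is reentrant, grabbing it inside the method is harmless when the caller already holds it, whereas the assert is skipped unless the JVM is started with -ea.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in; not the actual FSNamesystem code.
class RemoveStoredBlockSketch {
    final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private int storedBlocks;

    RemoveStoredBlockSketch(int initialBlocks) {
        this.storedBlocks = initialBlocks;
    }

    // Variant A: trusts the caller. The check is only evaluated with -ea.
    void removeWithAssert() {
        assert fsLock.isWriteLockedByCurrentThread();
        storedBlocks--;
    }

    // Variant B: takes the write lock itself. Safe even if the caller
    // already holds it, because the write lock is reentrant.
    void removeWithLock() {
        fsLock.writeLock().lock();
        try {
            storedBlocks--;
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    int getStoredBlocks() {
        fsLock.readLock().lock();
        try {
            return storedBlocks;
        } finally {
            fsLock.readLock().unlock();
        }
    }
}
```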

bq. Do you have a suggestion on how I can fix the code in FSPermissionChecker.checkPermission()?
This method is called only from FSNamesystem. How about grabbing the readLock and then calling
checkPermission without passing the root INodeDirectory?


> Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1093
>                 URL: https://issues.apache.org/jira/browse/HDFS-1093
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: NNreadwriteLock.txt, NNreadwriteLock_trunk_1.txt, NNreadwriteLock_trunk_2.txt, NNreadwriteLock_trunk_3.txt
>
>
> Most critical data structures in the NameNode (NN) are protected by synchronized methods
> in the FSNamesystem class. This essentially makes critical code paths in the NN single-threaded.
> However, a large percentage of the NN calls are listStatus, getBlockLocations, etc., which do
> not change internal data structures at all; these are read-only calls. If we change the FSNamesystem
> lock to a read/write lock, many of the above operations can occur in parallel, thus improving
> the scalability of the NN.
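
The property the description relies on can be sketched with java.util.concurrent's ReentrantReadWriteLock. This is an assumption about the lock type, and the class below is a hypothetical demo rather than NameNode code. While one thread holds the read lock, a second reader is admitted but a writer is not.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of HDFS.
class ReadWriteLockSketch {
    // Returns {readerAdmitted, writerAdmitted} as observed by a second
    // thread while the calling thread holds the read lock.
    static boolean[] probeWhileReadLocked() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final boolean[] result = new boolean[2];
        lock.readLock().lock();
        try {
            Thread other = new Thread(() -> {
                // A second reader can share the lock with the first one.
                result[0] = lock.readLock().tryLock();
                if (result[0]) {
                    lock.readLock().unlock();
                }
                // A writer needs exclusive access, so this attempt fails.
                result[1] = lock.writeLock().tryLock();
                if (result[1]) {
                    lock.writeLock().unlock();
                }
            });
            other.start();
            try {
                other.join(); // join makes the other thread's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        } finally {
            lock.readLock().unlock();
        }
        return result;
    }
}
```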

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

