hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-4186) logSync() is called with the write lock held while releasing lease
Date Wed, 14 Nov 2012 05:14:14 GMT

     [ https://issues.apache.org/jira/browse/HDFS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-4186:
-----------------------------

    Attachment: hdfs-4186-trunk.patch

The patch defers logSync() slightly for lease reassignment. Several methods in FSNamesystem
that may cause lease reassignment now call logSync() in their finally block, after releasing
the write lock.  The call needs to be in a finally block because lease recovery can throw an
exception after the lease has been reassigned.
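
The pattern described above can be sketched roughly as follows. This is an illustration only,
not code from the attached patch: DeferredSyncSketch, recoverLease(), the events list and the
failAfterReassign flag are hypothetical names, and a plain ReentrantReadWriteLock stands in for
the namesystem lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the namesystem; not the actual HDFS classes. */
class DeferredSyncSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

    /** Records what happened and in which order, so the behaviour can be inspected. */
    final List<String> events = new ArrayList<>();

    /** Stand-in for logging an edit: cheap, done while the write lock is held. */
    void logEdit(String op) {
        events.add("logEdit:" + op);
    }

    /** Stand-in for the expensive sync; records whether the caller still holds the write lock. */
    void logSync() {
        events.add("logSync:writeLockHeld=" + fsLock.isWriteLockedByCurrentThread());
    }

    /** A method that may reassign a lease and then fail during recovery. */
    void recoverLease(String path, boolean failAfterReassign) {
        fsLock.writeLock().lock();
        try {
            logEdit("reassignLease " + path);   // logged under the lock, not yet synced
            if (failAfterReassign) {
                throw new IllegalStateException("recovery failed for " + path);
            }
        } finally {
            fsLock.writeLock().unlock();        // release the write lock first
            logSync();                          // then sync, even if an exception was thrown
        }
    }
}
```

In this sketch the reassignment is still synced when recoverLease() throws, and the sync never
runs while the write lock is held.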

logSync() was modified to check a new thread-local variable to determine whether the calling
thread has logged any transactions.  When this variable is false, logSync() returns right away.
The variable is set whenever logEdit() is called and cleared when the log is synced. It
is also set unconditionally for logSyncAll().  The variable does not indicate the sync state:
even if it is true, another thread might have already synced all transactions logged by this thread.
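
A minimal sketch of the thread-local flag, under stated assumptions: EditLogSketch, the
hasUnsyncedEdits flag, the transaction ID counters and flushCount are hypothetical names
chosen for illustration, and the real edit log's buffering and locking are more involved.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical edit log illustrating the per-thread "has logged something" flag. */
class EditLogSketch {
    // True if the current thread has logged an edit since its last sync (assumed name).
    private final ThreadLocal<Boolean> hasUnsyncedEdits = ThreadLocal.withInitial(() -> false);

    private final AtomicLong lastWrittenTxId = new AtomicLong();
    private long lastSyncedTxId = 0;   // guarded by "this"
    int flushCount = 0;                // number of actual flushes, for inspection

    void logEdit(String op) {
        lastWrittenTxId.incrementAndGet();
        hasUnsyncedEdits.set(true);    // set whenever this thread logs an edit
    }

    synchronized void logSync() {
        if (!hasUnsyncedEdits.get()) {
            return;                    // this thread logged nothing: return right away
        }
        hasUnsyncedEdits.set(false);   // cleared when the log is synced
        long target = lastWrittenTxId.get();
        // The flag is not the sync state: another thread may already have
        // synced everything up to target, in which case no flush is needed.
        if (target > lastSyncedTxId) {
            lastSyncedTxId = target;
            flushCount++;              // stand-in for the expensive flush to disk
        }
    }

    void logSyncAll() {
        hasUnsyncedEdits.set(true);    // set unconditionally so logSync() does not short-circuit
        logSync();
    }
}
```

With this shape, a caller can invoke logSync() unconditionally from a finally block; threads
that logged nothing pay only for the flag check.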

In the lease monitor thread, logSync() is called after checkLeases() is done and the write
lock is released. All transactions logged in checkLeases() will be batched.
                
> logSync() is called with the write lock held while releasing lease
> ------------------------------------------------------------------
>
>                 Key: HDFS-4186
>                 URL: https://issues.apache.org/jira/browse/HDFS-4186
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.4, 2.0.2-alpha
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: hdfs-4186-trunk.patch
>
>
> As pointed out in HDFS-4138, when the lease monitor calls internalReleaseLease(), it
> acquires the namespace write lock. Inside internalReleaseLease(), if a block recovery is needed,
> the lease is reassigned to the namenode itself and this is logged & synced in logReassignLease().
> Since this is done while the write lock is held, log syncing is blocked. When a large
> number of leases are expired and blocks are recovered, namenode can slow down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
