hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5790) LeaseManager.findPath is very slow when many leases need recovery
Date Thu, 30 Jan 2014 22:54:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887213#comment-13887213 ]

Suresh Srinivas commented on HDFS-5790:
---------------------------------------

I know that many HDFS restarts with running jobs that have many files open run into
this issue. In the past I fixed a bug where the namenode did an editlog sync while holding
the lock. Even with that fix, I see that this issue slows down lease recovery, and the
namenode becomes unresponsive in such restarts. That said, I am okay with not putting this
into 2.3.

> LeaseManager.findPath is very slow when many leases need recovery
> -----------------------------------------------------------------
>
>                 Key: HDFS-5790
>                 URL: https://issues.apache.org/jira/browse/HDFS-5790
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, performance
>    Affects Versions: 2.3.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 3.0.0, 2.4.0
>
>         Attachments: hdfs-5790.txt, hdfs-5790.txt
>
>
> We recently saw an issue where the NN restarted while tens of thousands of files were
> open. The NN then ended up spending multiple seconds for each commitBlockSynchronization()
> call, spending most of its time inside LeaseManager.findPath(). findPath currently works by
> looping over all files held for a given writer, and traversing the filesystem for each one.
> This takes way too long when tens of thousands of files are open by a single writer.
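
To illustrate the cost pattern described above, here is a minimal Java sketch. It is a toy model, not the actual HDFS code: the classes Namespace and Lease, the methods findPathByScan and findPathByIndex, and the numeric file ids are all hypothetical names introduced for illustration. The scan resolves every path held by one writer until it finds the target file, so each lookup costs up to one namespace traversal per open file. The id-indexed lookup is only one possible alternative and is not presented as the fix attached to this issue.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of the lookup; not the real LeaseManager. */
class LeaseLookupSketch {

    /** Toy namespace: maps a full path to a file id and counts resolutions. */
    static final class Namespace {
        private final Map<String, Long> pathToId = new HashMap<>();
        long resolveCalls = 0;

        void addFile(String path, long id) {
            pathToId.put(path, id);
        }

        /** Stands in for traversing the filesystem tree to resolve one path. */
        Long resolve(String path) {
            resolveCalls++;
            return pathToId.get(path);
        }
    }

    /** Toy lease: the set of paths a single writer holds open. */
    static final class Lease {
        final Set<String> paths = new LinkedHashSet<>();
    }

    /** The pattern from the report: resolve each leased path until one matches. */
    static String findPathByScan(Namespace ns, Lease lease, long targetId) {
        for (String p : lease.paths) {
            Long id = ns.resolve(p);
            if (id != null && id == targetId) {
                return p;
            }
        }
        return null;
    }

    /** Assumed alternative: a direct id-to-path index, one map lookup per call. */
    static String findPathByIndex(Map<Long, String> idToPath, long targetId) {
        return idToPath.get(targetId);
    }

    /** Returns how many path resolutions the scan needs to find the last file. */
    static long scanCost(int openFiles) {
        Namespace ns = new Namespace();
        Lease lease = new Lease();
        for (int i = 0; i < openFiles; i++) {
            String path = "/job/part-" + i;
            ns.addFile(path, i);
            lease.paths.add(path);
        }
        findPathByScan(ns, lease, openFiles - 1);
        return ns.resolveCalls;
    }

    public static void main(String[] args) {
        System.out.println("resolutions for one lookup: " + scanCost(20000));
    }
}
```

In this model, a single lookup for the most recently opened file performs one resolution per open file, and recovering every lease repeats that scan once per file, which is consistent with the multi-second commitBlockSynchronization() calls reported above.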



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
