hadoop-yarn-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3491) PublicLocalizer#addResource is too slow.
Date Mon, 27 Apr 2015 06:05:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513580#comment-14513580 ]

zhihai xu commented on YARN-3491:

Hi [~jira.shegalov], thanks for the information. Could you give some details about how to
switch to {{io.nativeio.NativeIO.POSIX#getFstat}}?
The attached patch currently tries to limit the number of calls to {{getInitializedLocalDirs}}.
Even if we switch to {{io.nativeio.NativeIO.POSIX#getFstat}}, the attached patch should still
be useful. IMHO it would be good to reduce the number of calls to {{getInitializedLocalDirs}}
and {{getInitializedLogDirs}} no matter which API we use.

Should we create a separate follow-up JIRA for switching to {{io.nativeio.NativeIO.POSIX#getFstat}}?
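
One generic way to limit how often an expensive directory check runs is to cache its result for a short interval. The sketch below only illustrates that idea and is not taken from the attached patch: {{ThrottledDirCheck}}, its constructor parameters, and the interval value are all hypothetical, and the supplier merely stands in for a call such as {{getInitializedLocalDirs}}.

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of YARN or of the attached patch).
 * Re-runs an expensive check at most once per minIntervalMs and
 * returns the cached result in between.
 */
final class ThrottledDirCheck {
    // Stands in for an expensive call such as getInitializedLocalDirs.
    private final Supplier<List<String>> expensiveCheck;
    // Injected clock so the behavior can be tested without sleeping.
    private final LongSupplier clockMs;
    private final long minIntervalMs;

    private List<String> cached;
    private long lastCheckMs;
    private int checkCount;

    ThrottledDirCheck(Supplier<List<String>> expensiveCheck,
                      LongSupplier clockMs,
                      long minIntervalMs) {
        this.expensiveCheck = expensiveCheck;
        this.clockMs = clockMs;
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns the cached directories, refreshing them only when stale. */
    synchronized List<String> get() {
        long now = clockMs.getAsLong();
        if (cached == null || now - lastCheckMs >= minIntervalMs) {
            cached = expensiveCheck.get();
            lastCheckMs = now;
            checkCount++;
        }
        return cached;
    }

    /** Number of times the expensive check actually ran. */
    synchronized int checkCount() {
        return checkCount;
    }
}
```

The trade-off in a cache like this is staleness: a directory that goes bad inside the interval is only noticed at the next refresh, so the interval would have to be chosen with the existing disk health checks in mind.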

> PublicLocalizer#addResource is too slow.
> ----------------------------------------
>                 Key: YARN-3491
>                 URL: https://issues.apache.org/jira/browse/YARN-3491
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: YARN-3491.000.patch, YARN-3491.001.patch, YARN-3491.002.patch
> Based on profiling, the bottleneck in PublicLocalizer#addResource is getInitializedLocalDirs, which calls checkLocalDir.
> checkLocalDir is very slow, taking 10+ ms per call.
> The total delay is therefore approximately (number of local dirs) * 10+ ms.
> This delay is added to each public resource localization.
> Because PublicLocalizer#addResource is slow, the thread pool can't be fully utilized: instead of running in parallel (multithreaded), public resource localization is serialized most of the time.
> PublicLocalizer#addResource also runs in the Dispatcher thread, so the Dispatcher thread is blocked by PublicLocalizer#addResource for a long time.
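
The arithmetic in the description can be made concrete with a small back-of-the-envelope model. This is a simplified sketch, not a measurement: the class {{AddResourceDelayModel}} is hypothetical, the directory count, download time, and pool size in the usage note are made-up inputs, and only the roughly 10 ms per checkLocalDir call comes from the issue. The utilization estimate assumes one dispatcher thread submitting tasks back to back, so on average about (task duration / submit delay) workers are busy, capped at the pool size.

```java
/**
 * Hypothetical back-of-the-envelope model of the delay described in the
 * issue. All inputs are assumptions supplied by the caller.
 */
final class AddResourceDelayModel {

    /** Delay added to one addResource call: one check per local dir. */
    static long perResourceDelayMs(int numLocalDirs, long checkMs) {
        return numLocalDirs * checkMs;
    }

    /** Total time the dispatcher thread spends in directory checks. */
    static long dispatcherBlockedMs(int numResources, int numLocalDirs, long checkMs) {
        return numResources * perResourceDelayMs(numLocalDirs, checkMs);
    }

    /**
     * Rough steady-state number of busy download workers when a single
     * thread submits one task every submitDelayMs and each task runs
     * for downloadMs. Capped at the pool size.
     */
    static double averageBusyWorkers(long downloadMs, long submitDelayMs, int poolSize) {
        double busy = (double) downloadMs / (double) submitDelayMs;
        return Math.min(poolSize, busy);
    }
}
```

For example, with 12 local dirs at 10 ms each, every addResource call adds 120 ms. If a download takes 100 ms, the model gives fewer than one busy worker on average, which matches the observation that localization is effectively serialized no matter how large the pool is.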

This message was sent by Atlassian JIRA
