Date: Wed, 15 Apr 2015 23:29:58 +0000 (UTC)
From: "Sangjin Lee (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-3491) Improve the public resource localization to do both FSDownload submission to the thread pool and completed localization handling in one thread (PublicLocalizer).

[ https://issues.apache.org/jira/browse/YARN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497304#comment-14497304 ]

Sangjin Lee commented on YARN-3491:
-----------------------------------

I have the same question as [~jlowe].
The actual call
{code}
synchronized (pending) {
  pending.put(queue.submit(new FSDownload(lfs, null, conf,
      publicDirDestPath, resource, request.getContext().getStatCache())),
      request);
}
{code}
should be completely non-blocking, and nothing about it is expensive, with the possible exception of the synchronization. Could you describe the root cause of the slowness you're seeing in more detail?

> Improve the public resource localization to do both FSDownload submission to the thread pool and completed localization handling in one thread (PublicLocalizer).
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3491
>                 URL: https://issues.apache.org/jira/browse/YARN-3491
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>
> Improve the public resource localization to do both FSDownload submission to the thread pool and completed localization handling in one thread (PublicLocalizer).
> Currently, FSDownload submission to the thread pool is done in PublicLocalizer#addResource, which runs in the Dispatcher thread, and completed localization handling is done in PublicLocalizer#run, which runs in the PublicLocalizer thread.
> Because FSDownload submission to the thread pool in the following code is time-consuming, the thread pool can't be fully utilized. Instead of public resources being localized in parallel (multithreading), public resource localization is serialized most of the time.
> {code}
>     synchronized (pending) {
>       pending.put(queue.submit(new FSDownload(lfs, null, conf,
>           publicDirDestPath, resource, request.getContext().getStatCache())),
>           request);
>     }
> {code}
> There are also two more benefits to this change:
> 1. The Dispatcher thread won't be blocked by the above FSDownload submission.
> The Dispatcher thread handles most of the time-critical events in the NodeManager.
> 2. No synchronization is needed on the HashMap (pending),
> because pending will only be accessed in the PublicLocalizer thread.
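
For illustration only, the arrangement proposed in the description above might look roughly like the sketch below. It is not the YARN-3491 patch and not the real PublicLocalizer: the class name, the String-based requests, the "/public/" path and the demo helper are all hypothetical stand-ins. It assumes the dispatcher only enqueues requests, while one localizer thread both submits work to the pool and handles completions, so the pending map is a plain HashMap touched by that thread alone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names come from Hadoop. */
final class SingleThreadLocalizerSketch extends Thread {
  // Hand-off from the dispatcher thread to the localizer thread.
  private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
  private final ExecutorService pool = Executors.newFixedThreadPool(4);
  private final CompletionService<String> queue =
      new ExecutorCompletionService<>(pool);
  // Only accessed inside run(), so no synchronization is needed.
  private final Map<Future<String>, String> pending = new HashMap<>();
  // Results published for other threads to read (stand-in for events).
  private final Map<String, String> localized = new ConcurrentHashMap<>();

  /** Called from the dispatcher thread: a cheap enqueue and nothing else. */
  void addResource(String resource) {
    requests.add(resource);
  }

  @Override
  public void run() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        // 1. Submit every newly requested resource to the thread pool.
        String request;
        while ((request = requests.poll()) != null) {
          final String name = request;
          // Stand-in for new FSDownload(...): "downloads" to a fake path.
          pending.put(queue.submit(() -> "/public/" + name), name);
        }
        // 2. Handle at most one completed download, waiting briefly.
        Future<String> completed = queue.poll(10, TimeUnit.MILLISECONDS);
        if (completed != null) {
          localized.put(pending.remove(completed), completed.get());
        }
      }
    } catch (InterruptedException e) {
      // Interrupted while polling: treat as a shutdown request.
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  /** Hypothetical driver: plays the dispatcher and waits up to 5 seconds. */
  static Map<String, String> demo(String... resources) {
    SingleThreadLocalizerSketch localizer = new SingleThreadLocalizerSketch();
    localizer.start();
    for (String r : resources) {
      localizer.addResource(r);
    }
    long deadline = System.currentTimeMillis() + 5000;
    while (localizer.localized.size() < resources.length
        && System.currentTimeMillis() < deadline) {
      try {
        Thread.sleep(5);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    localizer.interrupt();
    return new HashMap<>(localizer.localized);
  }
}
```

Whether this helps in practice depends on where the time actually goes, which is the open question in the comment above: if the cost is in constructing the download task or computing its destination path, moving that work off the dispatcher thread relieves the dispatcher but does not by itself make submission faster.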