hadoop-yarn-issues mailing list archives

From "Feng Yuan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-4095) Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService.
Date Fri, 07 Apr 2017 02:35:42 GMT

    [ https://issues.apache.org/jira/browse/YARN-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959029#comment-15959029 ]

Feng Yuan edited comment on YARN-4095 at 4/7/17 2:35 AM:
---------------------------------------------------------

[~zxu], thanks for your patch for this issue.
Excuse me, I am not very clear on the goal this patch is meant to achieve. Is it to avoid a
heap memory leak like the one in YARN-6277?
I ask because in:
{code}
      String newLocalDirs = conf.get(contextCfgItemName);
      if (!newLocalDirs.equals(savedLocalDirs)) {
{code}
it creates a large number of LocalFileSystem objects and caches them.
If your purpose is to fix this heap memory leak, I think I will understand this issue completely.
I also have an idea. The issue is caused by the configuration being different in the two places,
and I notice that ShuffleHandler uses another conf object created by the clone(conf) method.
How about letting "SH" use the same conf?
This would bring several benefits:
1. The ShuffleHandler service would know in a timely manner which disks are over-used (>95%)
and would not write data to them, which avoids some map output work going to an overloaded
disk and failing with the error "no space left...".
2. We could reconsider the implementation model in your patch. IMHO it is not very graceful
to just add a new name for local-dir.
Thanks.
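
To make the "use the same conf" suggestion concrete, here is a minimal sketch of the difference between a component that reads from a copy of a configuration and one that reads from the shared object. It is only an illustration under assumptions: a plain Map stands in for the Hadoop Configuration, and the key name, directory values and method names are hypothetical, not taken from ShuffleHandler or the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in sketch: a plain Map plays the role of a Hadoop Configuration. */
class ConfSharingSketch {
    static final String LOCAL_DIRS_KEY = "local-dirs"; // hypothetical key name

    /** Returns the directory list a reader sees after the writer drops a full disk. */
    static String dirsSeenAfterUpdate(boolean readerUsesCopy) {
        Map<String, String> writerConf = new HashMap<>();
        writerConf.put(LOCAL_DIRS_KEY, "/d1,/d2");

        // Copy: a snapshot taken at init time. Shared: the very same object as the writer's.
        Map<String, String> readerConf =
            readerUsesCopy ? new HashMap<>(writerConf) : writerConf;

        // The writer excludes /d2, for example because it crossed a disk-usage threshold.
        writerConf.put(LOCAL_DIRS_KEY, "/d1");

        return readerConf.get(LOCAL_DIRS_KEY);
    }

    public static void main(String[] args) {
        System.out.println("copy sees:   " + dirsSeenAfterUpdate(true));
        System.out.println("shared sees: " + dirsSeenAfterUpdate(false));
    }
}
```

In this model the reader holding a copy keeps seeing the original list, while the reader holding the shared object sees the excluded disk disappear, which is the behaviour benefit 1 relies on.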


> Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4095
>                 URL: https://issues.apache.org/jira/browse/YARN-4095
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: YARN-4095.000.patch, YARN-4095.001.patch
>
>
> Currently {{ShuffleHandler}} and {{LocalDirsHandlerService}} share the same {{AllocatorPerContext}}
> object in {{LocalDirAllocator}} for the configuration {{NM_LOCAL_DIRS}}, because {{AllocatorPerContext}}
> objects are stored in a static TreeMap with the configuration name as the key:
> {code}
>   private static Map <String, AllocatorPerContext> contexts = 
>                  new TreeMap<String, AllocatorPerContext>();
> {code}
> {{LocalDirsHandlerService}} and {{ShuffleHandler}} both create a {{LocalDirAllocator}}
> using {{NM_LOCAL_DIRS}}. Even though they don't use the same {{Configuration}} object, they
> will use the same {{AllocatorPerContext}} object. Also, {{LocalDirsHandlerService}} may change
> the {{NM_LOCAL_DIRS}} value in its {{Configuration}} object to exclude full and bad local dirs,
> while {{ShuffleHandler}} always uses the original {{NM_LOCAL_DIRS}} value in its {{Configuration}}
> object. So every time {{AllocatorPerContext#confChanged}} is called by {{ShuffleHandler}} after
> {{LocalDirsHandlerService}}, {{AllocatorPerContext}} needs to be reinitialized because the
> {{NM_LOCAL_DIRS}} value has changed. This causes some overhead.
> {code}
>       String newLocalDirs = conf.get(contextCfgItemName);
>       if (!newLocalDirs.equals(savedLocalDirs)) {
> {code}
> So it would be a good improvement not to share the same {{AllocatorPerContext}} instance
> between {{ShuffleHandler}} and {{LocalDirsHandlerService}}.
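
The reinitialization overhead described in the issue can be reproduced with a small model. The sketch below is not the Hadoop source: the Context class is a simplified stand-in for {{AllocatorPerContext}}, and the key names, directory values and the simulate helper are hypothetical. It only mirrors two things quoted above, the static map keyed by configuration name and the equals check in confChanged.

```java
import java.util.Map;
import java.util.TreeMap;

/** Simplified model (not the Hadoop source) of a per-context cache keyed by config name. */
class SharedContextSketch {
    /** Stand-in for AllocatorPerContext: remembers the last directory list it was given. */
    static class Context {
        String savedLocalDirs = "";
        int reinitCount = 0;

        // Mirrors the quoted check: reinitialize only when the value differs.
        void confChanged(String newLocalDirs) {
            if (!newLocalDirs.equals(savedLocalDirs)) {
                savedLocalDirs = newLocalDirs;
                reinitCount++;
            }
        }
    }

    // One static map for the whole process, keyed by configuration item name.
    private static final Map<String, Context> contexts = new TreeMap<>();

    static synchronized Context contextFor(String cfgItemName) {
        return contexts.computeIfAbsent(cfgItemName, k -> new Context());
    }

    /** Alternates two callers holding different values; returns how many reinits occurred. */
    static int simulate(String keyA, String keyB, int rounds) {
        Context a = contextFor(keyA);
        Context b = contextFor(keyB);
        int before = a.reinitCount + (a == b ? 0 : b.reinitCount);
        for (int i = 0; i < rounds; i++) {
            a.confChanged("/d1");      // caller that has excluded a full disk
            b.confChanged("/d1,/d2");  // caller that still holds the original list
        }
        return a.reinitCount + (a == b ? 0 : b.reinitCount) - before;
    }

    public static void main(String[] args) {
        System.out.println("shared key:    " + simulate("local-dirs", "local-dirs", 5));
        System.out.println("separate keys: " + simulate("local-dirs.nm", "local-dirs.shuffle", 5));
    }
}
```

With one shared key every call flips the saved value and forces a reinitialization, whereas with separate keys each context is initialized once and then stays stable.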



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

