hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4095) Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService.
Date Fri, 07 Apr 2017 03:50:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960233#comment-15960233
] 

zhihai xu commented on YARN-4095:
---------------------------------

[~Feng Yuan], I think, For ShuffleHandler, we always want it to access all local directories
which also include full local directories, since output data from mappers may be at the full
local directories. Otherwise shuffle may fail due to data or index file can't be found in
the good local directories.
{code}
          Path indexFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + INDEX_FILE_NAME, conf);
          Path mapOutputFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + DATA_FILE_NAME, conf);
    public Path getLocalPathToRead(String pathStr,
        Configuration conf) throws IOException {
      Context ctx = confChanged(conf);
      int numDirs = ctx.localDirs.length;
      int numDirsSearched = 0;
      //remove the leading slash from the path (to make sure that the uri
      //resolution results in a valid path on the dir being checked)
      if (pathStr.startsWith("/")) {
        pathStr = pathStr.substring(1);
      }
      while (numDirsSearched < numDirs) {
        Path file = new Path(ctx.localDirs[numDirsSearched], pathStr);
        if (ctx.localFS.exists(file)) {
          return file;
        }
        numDirsSearched++;
      }

      //no path found
      throw new DiskErrorException ("Could not find " + pathStr +" in any of" +
      " the configured local directories");
    }
{code}
I think This may be also the reason why we didn't want to use the same configuration between
ShuffleHandler and LocalDirHandlerService.

> Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler
and LocalDirsHandlerService.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4095
>                 URL: https://issues.apache.org/jira/browse/YARN-4095
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: YARN-4095.000.patch, YARN-4095.001.patch
>
>
> Currently {{ShuffleHandler}} and {{LocalDirsHandlerService}} share {{AllocatorPerContext}}
object in {{LocalDirAllocator}} for configuration {{NM_LOCAL_DIRS}} because {{AllocatorPerContext}}
are stored in a static TreeMap with configuration name as key
> {code}
>   private static Map <String, AllocatorPerContext> contexts = 
>                  new TreeMap<String, AllocatorPerContext>();
> {code}
> {{LocalDirsHandlerService}} and {{ShuffleHandler}} both create a {{LocalDirAllocator}}
using {{NM_LOCAL_DIRS}}. Even they don't use the same {{Configuration}} object, but they will
use the same {{AllocatorPerContext}} object. Also {{LocalDirsHandlerService}} may change {{NM_LOCAL_DIRS}}
value in its {{Configuration}} object to exclude full and bad local dirs, {{ShuffleHandler}}
always uses the original {{NM_LOCAL_DIRS}} value in its {{Configuration}} object. So every
time {{AllocatorPerContext#confChanged}} is called by {{ShuffleHandler}} after {{LocalDirsHandlerService}},
{{AllocatorPerContext}} need be reinitialized because {{NM_LOCAL_DIRS}} value is changed.
This will cause some overhead.
> {code}
>       String newLocalDirs = conf.get(contextCfgItemName);
>       if (!newLocalDirs.equals(savedLocalDirs)) {
> {code}
> So it will be a good improvement to not share the same {{AllocatorPerContext}} instance
between {{ShuffleHandler}} and {{LocalDirsHandlerService}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message