hadoop-yarn-dev mailing list archives

From "Chris Trezzo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-3637) Handle localization sym-linking correctly at the YARN level
Date Tue, 12 May 2015 21:50:01 GMT
Chris Trezzo created YARN-3637:
----------------------------------

             Summary: Handle localization sym-linking correctly at the YARN level
                 Key: YARN-3637
                 URL: https://issues.apache.org/jira/browse/YARN-3637
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Chris Trezzo
            Assignee: Chris Trezzo


The shared cache needs to handle resource sym-linking at the YARN layer. Currently, we let
the application layer (e.g. MapReduce) handle this, but it would probably be better for all
applications if YARN handled it transparently.

Here is the scenario:
Imagine two different jars (with distinct checksums) that share the same name, job.jar.

They are stored in the shared cache as two separate resources:
checksum1/job.jar
checksum2/job.jar

A new application tries to use both of these resources, but internally refers to them by
different names:
foo.jar maps to checksum1
bar.jar maps to checksum2

When the shared cache returns the paths to the resources, both resources have the same name
(i.e. job.jar). Because of this, one of them clobbers the other when the resources are
localized: both symlinks in the container_id directory have the same name (i.e. job.jar),
even though they point to two separate resource directories.
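
The collision described above can be illustrated with a small sketch. It is not YARN code: it
assumes, purely for illustration, that the symlink name is taken from the last path component
of the resource, and the function name link_names is hypothetical.

```python
import posixpath


def link_names(resource_paths):
    """Map each symlink name to the resource it ends up pointing at.

    Assumption (for illustration only): the link name is the last path
    component of the shared cache resource.
    """
    links = {}
    for path in resource_paths:
        name = posixpath.basename(path)
        # A later resource with the same file name overwrites the earlier
        # entry, which models one symlink clobbering the other.
        links[name] = path
    return links


links = link_names(["checksum1/job.jar", "checksum2/job.jar"])
print(links)  # {'job.jar': 'checksum2/job.jar'}
```

Only one entry survives, so the application can no longer reach the jar stored under checksum1.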

Originally we tackled this in the MapReduce client by using the fragment portion of the resource
URL. This, however, seems like something that should be solved at the YARN layer.
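
A rough sketch of the fragment idea follows. It assumes the application-facing name is carried
in the URL fragment (for example checksum1/job.jar#foo.jar) and that the link name falls back
to the file name when no fragment is present. The function name link_name and the example URLs
are hypothetical; this is not the MapReduce client's actual code.

```python
import posixpath
from urllib.parse import urlsplit


def link_name(resource_url):
    """Pick the symlink name for a resource URL.

    Assumption (for illustration only): prefer the URL fragment when one
    is given, otherwise use the last path component.
    """
    parts = urlsplit(resource_url)
    return parts.fragment or posixpath.basename(parts.path)


urls = ["checksum1/job.jar#foo.jar", "checksum2/job.jar#bar.jar"]
# Build link name -> resource path; the fragments keep the names distinct.
links = {link_name(u): urlsplit(u).path for u in urls}
print(links)  # {'foo.jar': 'checksum1/job.jar', 'bar.jar': 'checksum2/job.jar'}
```

With distinct fragments, both resources get their own symlink even though the underlying files
share the name job.jar.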



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
