hadoop-yarn-issues mailing list archives

From "Chris Trezzo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3637) Handle localization sym-linking correctly at the YARN level
Date Fri, 20 Jan 2017 23:08:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832580#comment-15832580 ]

Chris Trezzo commented on YARN-3637:

bq. It might be better to overload the use() method instead of replacing it.

[~templedf] Thinking about your previous comment some more, I may have missed your point the
first time. I now realize that the overridden use() method can simply honor the fragment portion
of the URL. If there is no fragment, we can use the original path's name as the new
fragment to preserve the resource name. This provides the same functionality without the
extra parameter. I will fix the patch and post a new version. Let me know if you had something
different in mind. Thanks again!
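
For illustration only, the fragment handling described above could look roughly like the following. This is a minimal sketch built on plain java.net.URI, not the code in the patch; the class name FragmentDefaulting, the helper withDefaultFragment, and the hdfs://nn/sharedcache/... paths are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentDefaulting {

    /**
     * Hypothetical helper: keeps an existing fragment, otherwise appends the
     * last path component as the fragment so the resource name is preserved.
     * Assumes a hierarchical URI with a non-null path.
     */
    static URI withDefaultFragment(URI resource) throws URISyntaxException {
        if (resource.getFragment() != null) {
            // The caller already chose a name; honor it.
            return resource;
        }
        String path = resource.getPath();
        String name = path.substring(path.lastIndexOf('/') + 1);
        // Rebuild the URI with the file name as the fragment.
        return new URI(resource.getScheme(), resource.getAuthority(),
                path, resource.getQuery(), name);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Fragment present: returned unchanged.
        System.out.println(withDefaultFragment(
                new URI("hdfs://nn/sharedcache/checksum1/job.jar#foo.jar")));
        // No fragment: the path name job.jar becomes the fragment.
        System.out.println(withDefaultFragment(
                new URI("hdfs://nn/sharedcache/checksum2/job.jar")));
    }
}
```

With this shape, a caller that wants a different link name only has to set the fragment, and a caller that sets nothing still gets the original resource name.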

> Handle localization sym-linking correctly at the YARN level
> -----------------------------------------------------------
>                 Key: YARN-3637
>                 URL: https://issues.apache.org/jira/browse/YARN-3637
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>         Attachments: YARN-3637-trunk.001.patch, YARN-3637-trunk.002.patch
> The shared cache needs to handle resource sym-linking at the YARN layer. Currently, we
> let the application layer (i.e. MapReduce) handle this, but it is probably better for all
> applications if it is handled transparently.
> Here is the scenario:
> Imagine two separate jars (with unique checksums) that have the same name job.jar.
> They are stored in the shared cache as two separate resources:
> checksum1/job.jar
> checksum2/job.jar
> A new application tries to use both of these resources, but internally refers to them
> by different names:
> foo.jar maps to checksum1
> bar.jar maps to checksum2
> When the shared cache returns the paths to the resources, both resources have the same
> name (i.e. job.jar). Because of this, when the resources are localized, one of them clobbers
> the other: both symlinks in the container_id directory have the same name (i.e.
> job.jar) even though they point to two separate resource directories.
> Originally we tackled this in the MapReduce client by using the fragment portion of the
> resource URL. This, however, seems like something that should be solved at the YARN layer.
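
The scenario quoted above can be modeled in a few lines. This is only a sketch under stated assumptions: it does not touch the NodeManager or a real file system, a map from link name to target stands in for the container directory, and the class SymlinkNameCollision, its methods, and the /sharedcache/... paths are hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class SymlinkNameCollision {

    /** Link name taken from the last path component, ignoring any fragment. */
    static String nameFromPath(URI resource) {
        String path = resource.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Link name taken from the fragment when present, else from the path. */
    static String nameFromFragment(URI resource) {
        String fragment = resource.getFragment();
        return fragment != null ? fragment : nameFromPath(resource);
    }

    /**
     * Models the container directory as link name -> target path.
     * A later resource with the same link name overwrites the earlier one,
     * which is the clobbering described in the issue.
     */
    static Map<String, String> planLinks(boolean honorFragment, URI... resources) {
        Map<String, String> links = new LinkedHashMap<>();
        for (URI r : resources) {
            String name = honorFragment ? nameFromFragment(r) : nameFromPath(r);
            links.put(name, r.getPath());
        }
        return links;
    }

    public static void main(String[] args) {
        URI foo = URI.create("/sharedcache/checksum1/job.jar#foo.jar");
        URI bar = URI.create("/sharedcache/checksum2/job.jar#bar.jar");

        // Naming by path: both resources map to job.jar, so one entry survives.
        System.out.println(planLinks(false, foo, bar));
        // Naming by fragment: foo.jar and bar.jar stay distinct.
        System.out.println(planLinks(true, foo, bar));
    }
}
```

Naming the links by path leaves a single job.jar entry pointing at checksum2, while naming them by fragment keeps foo.jar and bar.jar pointing at their own resource directories.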

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
