hadoop-mapreduce-issues mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
Date Tue, 27 Jul 2010 22:55:18 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892971#action_12892971 ]

Owen O'Malley commented on MAPREDUCE-1288:

(2) introduce the concept of group sharing of distributed cache files so as to avoid repetitive
downloads for group shared files also. This may be a complex solution after all.

This would be quite complex to get right. In particular, it is difficult to determine which
group should have access. If we want to improve it, I'd suggest that we use hardlinks to give
each user access to a single copy of the file. Of course, you need to ensure that they do
in fact have read access to the original file. *smile*
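
To illustrate the hardlink suggestion, here is a minimal Java sketch. It is not taken from any patch on this issue: the class name, the linkForUser helper, and the directory layout are all hypothetical. It relies on java.nio.file.Files.createLink, which requires the link and the original to be on the same filesystem. Note that the read check below runs as the current process; a real TaskTracker would have to verify the requesting user's permission on the original file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class HardLinkCacheSketch {
    /**
     * Hypothetical helper: give one user access to an already-localized
     * file by hard-linking it into that user's directory instead of
     * downloading a second copy.
     */
    static Path linkForUser(Path sharedCopy, Path userDir) throws IOException {
        // Assumption: this check stands in for "does the user have read
        // access to the original file"; it only checks the current process.
        if (!Files.isReadable(sharedCopy)) {
            throw new IOException("no read access to " + sharedCopy);
        }
        Files.createDirectories(userDir);
        Path link = userDir.resolve(sharedCopy.getFileName());
        if (!Files.exists(link)) {
            // Both names now refer to the same bytes on disk.
            Files.createLink(link, sharedCopy);
        }
        return link;
    }
}
```

Calling linkForUser a second time for the same user returns the existing link, so the file is stored once no matter how many users share it.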

> DistributedCache localizes only once per cache URI
> --------------------------------------------------
>                 Key: MAPREDUCE-1288
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distributed-cache, security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Devaraj Das
>            Priority: Critical
>         Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch, MR-1288-bp20-3.patch
> As part of the file localization, the distributed cache localizer creates a copy of the
> file in the corresponding user's private directory. The localization in DistributedCache
> uses the URI of the cache file as the key, and if that key already exists in the map, the
> localization is not done again. This means that another user cannot access the same
> distributed cache file. We should change the key to include the username so that
> localization is done for every user.
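
The keying change described in the issue can be sketched roughly as follows. This is an illustration only, not the code in the attached patches: the class, the key format, and the placeholder paths are assumptions, and nothing is actually copied. The point is that a key built from the user name plus the URI triggers one localization per user, while repeat requests from the same user still hit the map.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class LocalizationKeySketch {
    // Stands in for the in-memory map of already-localized cache files.
    private final Map<String, String> localized = new HashMap<>();
    private int copies = 0;

    // Hypothetical key format: user name plus cache URI.
    // Keying on the URI alone is what the issue reports as the bug.
    static String key(String user, URI uri) {
        return user + "|" + uri;
    }

    // Returns the (placeholder) local path, localizing only on a new key.
    String localize(String user, URI uri) {
        return localized.computeIfAbsent(key(user, uri), k -> {
            copies++; // a real localizer would download the file here
            return "/private/" + user + uri.getPath();
        });
    }

    int copies() {
        return copies;
    }
}
```

With this key, two users requesting the same URI each get a copy in their own private directory, and a second request from either user does not localize again.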

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
