hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
Date Thu, 04 Mar 2010 09:34:27 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated MAPREDUCE-1435:
----------------------------------------

    Attachment: 1435.v4.patch

Patch incorporates review comments from Amarsri and Ravi. Changes are:

- I am now using ClusterWithLinuxTaskController.taskTrackerSpecialGroup as the expected group
for private distributed cache files.
- Added owner and group-owner checks for public distributed cache files. The group owner
for the public distributed cache is the primary group of the tasktracker. I added ClusterWithLinuxTaskController.taskTrackerPrimaryGroup
along the same lines as ClusterWithLinuxTaskController.taskTrackerSpecialGroup.

However,

bq. Once we add that also to the checks of public distributed cache files, then ClusterWithLinuxTaskController.checkPermissionsOnDir()
can be reused for these checks also and can avoid TestTrackerDistributedCacheManager.checkPublicFilePermissions()
possibly.

I have not done the above. This is because the permission checks for private distributed
cache files require an exact match of all the permissions for owner, group and others. For public
distributed cache files, the code only adds the 'read' and 'execute' bits for all users. Specifically,
it does not modify the 'write' bits. This means that the write permissions are indeterminate
(for example, they could depend on the permissions of files in an archive that is unarchived in
the distributed cache). Hence, instead of reusing the private-file model, I have retained
the original model for checking permissions of the public cache files.
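
To illustrate the difference between the two models, here is a minimal sketch in plain Java. It is
not the code in the patch: the class CachePermissionChecks and both method names are hypothetical,
and it uses java.nio permission sets rather than Hadoop's own types. It only shows why an exact-match
check cannot be reused when the write bits are not known in advance.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the Hadoop test code. */
class CachePermissionChecks {

  // The bits that the public-cache code path is described as adding for all users.
  private static final Set<PosixFilePermission> READ_AND_EXECUTE = EnumSet.of(
      PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_EXECUTE,
      PosixFilePermission.GROUP_READ, PosixFilePermission.GROUP_EXECUTE,
      PosixFilePermission.OTHERS_READ, PosixFilePermission.OTHERS_EXECUTE);

  /** Private-cache style check: every owner, group and others bit must match. */
  public static boolean matchesExactly(Set<PosixFilePermission> actual, String expected) {
    return actual.equals(PosixFilePermissions.fromString(expected));
  }

  /** Public-cache style check: read and execute must be set; write bits are not examined. */
  public static boolean hasReadAndExecuteForAll(Set<PosixFilePermission> actual) {
    return actual.containsAll(READ_AND_EXECUTE);
  }

  public static void main(String[] args) {
    // Assumed example: a file unpacked from an archive that happened to carry group write.
    Set<PosixFilePermission> fromArchive = PosixFilePermissions.fromString("rwxrwxr-x");
    System.out.println(matchesExactly(fromArchive, "rwxr-xr-x")); // false: group write differs
    System.out.println(hasReadAndExecuteForAll(fromArchive));     // true: write bits ignored
  }
}
```

With an exact-match check, the expected string would have to anticipate whatever write bits the
archive contained, which is why the looser check is kept for public files.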

I ran all task-controller tests on this, and they passed.

> symlinks in cwd of the task are not handled properly after MAPREDUCE-896
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1435
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Ravi Gummadi
>             Fix For: 0.22.0
>
>         Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, 1435.v4.patch,
MR-1435-y20s.patch
>
>
> With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete
on each path listed. If the listed file is a symlink to a directory, it will delete the contents
of the linked directory. This would delete files from the distributed cache and jars directory, if
mapred.create.symlink is true.
> Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change
ownership/permissions of underlying files.
> This is observed by Karam while running streaming jobs with DistributedCache and jvm
reuse.
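
The failure mode described above can be avoided by unlinking symlinks rather than deleting through
them. The following is a minimal sketch using java.nio, not the fix in the attached patches: the
class WorkDirCleanupSketch and the methods cleanWorkDir and demo are hypothetical names, and the
real code works with Hadoop's FileSystem API rather than java.nio.file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration; not the TaskRunner implementation. */
class WorkDirCleanupSketch {

  /** Empties workDir. Symlinks are unlinked; the directories they point to are left alone. */
  public static void cleanWorkDir(Path workDir) throws IOException {
    List<Path> entries;
    try (Stream<Path> listing = Files.list(workDir)) {
      entries = listing.collect(Collectors.toList());
    }
    for (Path entry : entries) {
      if (Files.isSymbolicLink(entry)) {
        Files.delete(entry); // removes the link itself, not its target
      } else {
        deleteRecursively(entry);
      }
    }
  }

  private static void deleteRecursively(Path root) throws IOException {
    // walkFileTree does not follow symlinks unless FOLLOW_LINKS is requested,
    // so nested links reach visitFile and are unlinked there.
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null) {
          throw exc;
        }
        Files.delete(dir);
        return FileVisitResult.CONTINUE;
      }
    });
  }

  /** Builds a throwaway layout and reports whether the linked file survived cleanup. */
  public static boolean demo() {
    try {
      Path cache = Files.createTempDirectory("cache");
      Path cachedFile = Files.createFile(cache.resolve("lib.jar"));
      Path workDir = Files.createTempDirectory("work");
      Files.createSymbolicLink(workDir.resolve("cache-link"), cache);
      Files.createFile(workDir.resolve("scratch.txt"));
      cleanWorkDir(workDir);
      try (Stream<Path> left = Files.list(workDir)) {
        return Files.exists(cachedFile) && left.count() == 0;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The demo assumes a file system that permits creating symbolic links. The same reasoning applies
to changing ownership or permissions: operating on the link path without following it leaves the
underlying cache files untouched.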

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

