hadoop-mapreduce-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5706) toBeDeleted parent directories aren't being cleaned up
Date Wed, 02 Apr 2014 20:17:16 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-5706:
-------------------------------------

    Status: Patch Available  (was: Open)

> toBeDeleted parent directories aren't being cleaned up
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-5706
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5706
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.22.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5706.patch
>
>
> When security is enabled on 0.22, MRAsyncDiskService doesn't always delete the parent
directories under {{toBeDeleted}}.
> MRAsyncDiskService goes through {{toBeDeleted}} and creates "tasks" to delete the directories
under there using the LinuxTaskController. It chooses which user to run as by looking at who
owns that directory.
> For example:
> {noformat}
> ls -al /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0
> total 12
> drwxr-xr-x 3 mapred mapred 4096 Jul  5 05:37 .
> drwxr-xr-x 5 mapred mapred 4096 Dec 19 10:15 ..
> drwxr-s--- 4 test   mapred 4096 Jul  2 02:54 test
> {noformat}
> It would create a task that uses the "test" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0/test
(there could be more in there for other users). It then creates a task that uses the "mapred" user
to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0.
> So, the problem is that we normally configure "mapred" to not be allowed by the LinuxTaskController
in the /etc/hadoop/conf.cloudera.mapreduce1/taskcontroller.cfg.  The permissions on the toBeDeleted
dir are drwxr-xr-x mapred:mapred, which means that only "mapred" can delete things in it (i.e.
the timestamped dirs).  However, the MRAsyncDiskService is already running as the mapred user,
so there's no reason to use the LinuxTaskController for impersonation anyway; we can directly
do it from the Java code.
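
As a rough illustration of deleting directly from Java instead of going through the LinuxTaskController, here is a minimal sketch. It is not the attached patch: the class and method names ({{DirectDeleteSketch}}, {{ownedByCurrentUser}}, {{deleteRecursively}}, {{deleteIfOwnedByCurrentUser}}) are hypothetical, and it assumes the JVM user can be compared to the directory owner by name.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical sketch; none of these names come from Hadoop.
class DirectDeleteSketch {

    /** True when the directory's owner is the user this JVM already runs as. */
    static boolean ownedByCurrentUser(File dir) {
        try {
            String owner = Files.getOwner(dir.toPath()).getName();
            return owner.equals(System.getProperty("user.name"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Recursively deletes f; returns true if f no longer exists afterwards. */
    static boolean deleteRecursively(File f) {
        // Do not descend into symlinked directories; only the link is removed.
        if (f.isDirectory() && !Files.isSymbolicLink(f.toPath())) {
            File[] children = f.listFiles();
            if (children != null) {
                for (File child : children) {
                    deleteRecursively(child);
                }
            }
        }
        return f.delete() || !f.exists();
    }

    /** Deletes in-process only when no impersonation would be needed. */
    static boolean deleteIfOwnedByCurrentUser(File dir) {
        return ownedByCurrentUser(dir) && deleteRecursively(dir);
    }
}
```

Under these assumptions, a timestamped directory owned by "mapred" would be removed in-process, while subdirectories owned by other users (such as "test" above) would still need the impersonating path.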
> Another issue is that {{MRAsyncDiskService#deletePathsInSecureCluster}} expects an absolute
file path (e.g. {{/mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0}}), but {{MRAsyncDiskService#moveAndDeleteRelativePath}}
passes in a relative path (e.g. {{toBeDeleted/2013-07-05_05-37-49.052_0}}).  
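
The relative-versus-absolute mismatch can be pictured with a small sketch that resolves a relative path against a local directory before it is handed to code expecting an absolute one. This is only an illustration under assumptions: {{PathResolutionSketch}}, {{toAbsolute}} and the {{localDir}} parameter are hypothetical names, a Unix-style file system is assumed, and the attached patch may address the problem differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; not the MRAsyncDiskService implementation.
class PathResolutionSketch {

    /**
     * Returns pathName unchanged (apart from normalization) if it is already
     * absolute; otherwise resolves it against localDir.
     */
    static Path toAbsolute(String localDir, String pathName) {
        Path p = Paths.get(pathName);
        if (p.isAbsolute()) {
            return p.normalize();
        }
        return Paths.get(localDir).resolve(p).normalize();
    }
}
```

With {{localDir}} set to {{/mapred/local}}, the relative path {{toBeDeleted/2013-07-05_05-37-49.052_0}} resolves to the absolute form quoted in the description.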



--
This message was sent by Atlassian JIRA
(v6.2#6252)
