hadoop-mapreduce-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5706) toBeDeleted parent directories aren't being cleaned up
Date Mon, 12 May 2014 06:26:15 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-5706:

       Resolution: Fixed
    Fix Version/s: 0.22.1
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Thanks Robert. Just committed this to branch-0.22.

> toBeDeleted parent directories aren't being cleaned up
> ------------------------------------------------------
>                 Key: MAPREDUCE-5706
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5706
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.22.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>             Fix For: 0.22.1
>         Attachments: MAPREDUCE-5706.patch
> When security is enabled on 0.22, MRAsyncDiskService doesn't always delete the parent
directories under {{toBeDeleted}}.
> MRAsyncDiskService goes through {{toBeDeleted}} and creates "tasks" to delete the directories
under there using the LinuxTaskController. It chooses which user to run as by looking at who
owns that directory.
> For example:
> {noformat}
> ls -al /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0
> total 12
> drwxr-xr-x 3 mapred mapred 4096 Jul  5 05:37 .
> drwxr-xr-x 5 mapred mapred 4096 Dec 19 10:15 ..
> drwxr-s--- 4 test   mapred 4096 Jul  2 02:54 test
> {noformat}
> It would create a task that runs as the "test" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0/test
(there could be more directories in there for other users). It then creates a task that runs as the "mapred" user
to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0.
> So, the problem is that we normally configure "mapred" to not be allowed by the LinuxTaskController
in the /etc/hadoop/conf.cloudera.mapreduce1/taskcontroller.cfg.  The permissions on the toBeDeleted
dir are drwxr-xr-x mapred:mapred, which means that only "mapred" can delete things in it (i.e.
the timestamped dirs).  However, the MRAsyncDiskService is already running as the mapred user,
so there's no reason to use the LinuxTaskController for impersonation anyway; we can directly
do it from the Java code.
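
The idea of deleting the parent directory directly from Java, rather than through an impersonating task controller, can be sketched roughly as follows. This is a standalone illustration using java.nio.file (Java 7+), not the code in MAPREDUCE-5706.patch; the class name DirectDelete and both method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of Hadoop: deletes a directory tree
// in-process when the current user already owns it.
final class DirectDelete {
    private DirectDelete() {}

    // True if the directory's owner name matches the JVM's "user.name".
    // Assumption: owner names and user.name use the same form on this OS.
    static boolean ownedByCurrentUser(Path dir) throws IOException {
        String owner = Files.getOwner(dir).getName();
        return owner.equals(System.getProperty("user.name"));
    }

    // Deletes files first, then each directory once it has been emptied.
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // propagate a failure from listing the directory
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Under these assumptions, a caller would check ownedByCurrentUser(dir) and call deleteRecursively(dir) when it returns true, leaving only the per-user subdirectories (such as "test" above) to the task controller.
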
> Another issue is that {{MRAsyncDiskService#deletePathsInSecureCluster}} expects an absolute
file path (e.g. {{/mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0}}), but {{MRAsyncDiskService#moveAndDeleteRelativePath}}
passes in a relative path (e.g. {{toBeDeleted/2013-07-05_05-37-49.052_0}}).  
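
The relative-versus-absolute mismatch described above could be handled by resolving the path against its local volume root before deletion. The following is a minimal sketch under that assumption, using java.nio.file (Java 7+) and Unix-style paths; ToBeDeletedPaths and toAbsolute are hypothetical names and do not reflect how the committed patch addresses it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: normalizes a path name to an
// absolute path under a given local volume root.
final class ToBeDeletedPaths {
    private ToBeDeletedPaths() {}

    // Absolute inputs are returned as-is (normalized); relative inputs
    // are resolved against volumeRoot first.
    static Path toAbsolute(Path volumeRoot, String pathName) {
        Path p = Paths.get(pathName);
        return (p.isAbsolute() ? p : volumeRoot.resolve(p)).normalize();
    }
}
```

With a volume root of /mapred/local, the relative name toBeDeleted/2013-07-05_05-37-49.052_0 resolves to /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0, while an already absolute path passes through unchanged.
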

This message was sent by Atlassian JIRA
