hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5327) Job files for a job failing because of ACLs are not clean from the system directory
Date Fri, 06 Mar 2009 10:18:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679548#action_12679548 ]

Hemanth Yamijala commented on HADOOP-5327:
------------------------------------------

I think this patch should also add the system directory to the clean-up thread in the code
path where job submission fails due to ACLs. In the majority of cases, this action alone
will prevent the problem from happening in the first place. However, this is in addition
to the changes already in the patch, which are still needed to handle cases where the
JobTracker is restarted before the clean-up thread has had a chance to delete the system
directory completely.
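
The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the class, the queue, and the methods `submitJob`, `drainCleanupQueue`, and `newTempJobDir` are hypothetical stand-ins, the ACL check is reduced to a boolean, and the local file system is used in place of the JobTracker's real file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from Hadoop or from the patch.
class SubmissionCleanupSketch {
    // Stand-in for the queue that a clean-up thread would consume.
    static final BlockingQueue<Path> cleanupQueue = new LinkedBlockingQueue<>();

    // Assumed shape of submission: the ACL check is reduced to a boolean flag.
    // On ACL failure the job's directory is handed to the clean-up queue.
    static boolean submitJob(Path jobSystemDir, boolean aclAllowsUser) {
        if (!aclAllowsUser) {
            cleanupQueue.add(jobSystemDir);
            return false; // submission rejected
        }
        return true; // submission accepted, nothing to clean up
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One pass of what a clean-up thread would do in a loop.
    // Returns the number of directories taken off the queue.
    static int drainCleanupQueue() {
        int processed = 0;
        Path dir;
        while ((dir = cleanupQueue.poll()) != null) {
            deleteRecursively(dir);
            processed++;
        }
        return processed;
    }

    // Creates a temporary directory with one file, standing in for job files.
    static Path newTempJobDir() {
        try {
            Path dir = Files.createTempDirectory("job-system-dir");
            Files.createFile(dir.resolve("job.xml")); // illustrative file name
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a rejected submission leaves nothing behind once the queue has been drained. If the process stops before the drain runs, the directory is still on disk, which is the case the rest of the patch has to cover.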

Regarding cleanup, there seem to be two different cases here:

- The job was never submitted in the first place
- The job was already running, and after a restart it can no longer run because the
ACLs were changed.

I think the patch handles the first case cleanly (with the comments incorporated). In
the second case, ideally the job should be killed by the JobTracker so that all parts related
to the job (system directory, running tasks, cleanup task, etc.) are cleaned up properly. I
think handling the second case (which ideally should be rare) should be a separate JIRA.
Thoughts?

> Job files for a job failing because of  ACLs are not clean from the system directory
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5327
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5327
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Karam Singh
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-5327-v2.3.patch
>
>
> Jobs which failed because of ACLs get added during JT restart recovery

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

