hadoop-common-dev mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-3182) JobClient creates submitJobDir with SYSTEM_DIR_PERMISSION ( rwx-wx-wx)
Date Mon, 07 Apr 2008 22:55:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586561#action_12586561
] 

szetszwo edited comment on HADOOP-3182 at 4/7/08 3:53 PM:
------------------------------------------------------------------------

Below is the clean-up trace for wordcount:

h4. Step c1: Task.saveTaskOutput
- FileSystem.delete <job-output-dir>/_temporary/_task_200804071355_0001_r_000000_0 by JobTracker as user_account
	at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:557)
	at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2208)

h4. Step c2: JobInProgress.garbageCollect
- FileSystem.delete <job-dir> by JobTracker as user_account
	at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1637)
	at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
	at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
	at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
	at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by new Path(profile.getJobFile()).getParent()

- FileSystem.delete <job-dir> *again* by JobTracker as user_account
	at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1642)
	at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
	at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
	at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
	at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by new Path(conf.getSystemDir(), jobId)

*Question:*  Are new Path(profile.getJobFile()).getParent() and new Path(conf.getSystemDir(), jobId)
supposed to be different directories?
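
To make the question concrete, here is a minimal sketch of how the two expressions would resolve if the job file were laid out as <system-dir>/<job-id>/job.xml. That layout, the system directory value, and the helper names are assumptions for illustration only. The sketch uses the JDK's java.nio.file.Path as a stand-in for org.apache.hadoop.fs.Path, so it is not Hadoop code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class JobDirPaths {
    // Hypothetical stand-in for: new Path(profile.getJobFile()).getParent()
    static Path fromJobFile(String jobFile) {
        return Paths.get(jobFile).getParent();
    }

    // Hypothetical stand-in for: new Path(conf.getSystemDir(), jobId)
    static Path fromSystemDir(String systemDir, String jobId) {
        return Paths.get(systemDir, jobId);
    }

    public static void main(String[] args) {
        // Assumed values; only the job id pattern is taken from the trace above.
        String systemDir = "/tmp/hadoop/mapred/system";
        String jobId = "job_200804071355_0001";
        // Assumed layout: the job file sits directly inside <system-dir>/<job-id>.
        String jobFile = systemDir + "/" + jobId + "/job.xml";

        Path a = fromJobFile(jobFile);
        Path b = fromSystemDir(systemDir, jobId);
        System.out.println("from job file:   " + a);
        System.out.println("from system dir: " + b);
        System.out.println("same directory:  " + a.equals(b));
    }
}
```

Under that assumed layout both expressions name the same directory, so the second delete would find nothing left to remove. If the job file lives elsewhere, the two paths differ and both deletes are meaningful.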

- FileUtil.fullyDelete <job-output-dir>/_temporary as user_account
	at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1650)
	at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
	at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
	at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
	at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)

*Question:*  What is the purpose of FileUtil.fullyDelete here?  Is it equivalent to
FileSystem.delete(path, recursive=true)?

  
> JobClient creates submitJobDir with SYSTEM_DIR_PERMISSION ( rwx-wx-wx)
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-3182
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3182
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.2
>            Reporter: lohit vijayarenu
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.16.3, 0.17.0
>
>
> JobClient creates submitJobDir with SYSTEM_DIR_PERMISSION ( rwx-wx-wx ), which causes problems
> when sharing a cluster.
> Consider the case where userA starts the jobtracker/tasktrackers and userB submits a job to this
> cluster. When userB creates submitJobDir, it is created with rwx-wx-wx, so it cannot be read by
> a tasktracker started by userA.
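
As a rough illustration of why rwx-wx-wx blocks the tasktracker, the sketch below parses the permission string with the JDK's PosixFilePermissions helper and checks the read bits. It is an assumption-laden stand-in: it does not use Hadoop's own permission classes, and the class and method names are hypothetical.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class SubmitDirPermissionCheck {
    // True if the group read bit is set in a symbolic string such as "rwx-wx-wx".
    static boolean groupCanRead(String symbolic) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(symbolic);
        return perms.contains(PosixFilePermission.GROUP_READ);
    }

    // True if the read bit for all other users is set.
    static boolean othersCanRead(String symbolic) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(symbolic);
        return perms.contains(PosixFilePermission.OTHERS_READ);
    }

    public static void main(String[] args) {
        // Permission string taken from the issue title.
        String systemDirPermission = "rwx-wx-wx";
        System.out.println("group can read:  " + groupCanRead(systemDirPermission));
        System.out.println("others can read: " + othersCanRead(systemDirPermission));
    }
}
```

Only the owner (userB in the scenario above) holds the read bit, so a process running as userA can traverse and write into the directory but cannot list or read it.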

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

