Return-Path: Delivered-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org Received: (qmail 86132 invoked from network); 6 Dec 2009 06:37:46 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 6 Dec 2009 06:37:45 -0000 Received: (qmail 30704 invoked by uid 500); 6 Dec 2009 06:37:45 -0000 Delivered-To: apmail-hadoop-mapreduce-issues-archive@hadoop.apache.org Received: (qmail 30634 invoked by uid 500); 6 Dec 2009 06:37:45 -0000 Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: mapreduce-issues@hadoop.apache.org Delivered-To: mailing list mapreduce-issues@hadoop.apache.org Received: (qmail 30624 invoked by uid 99); 6 Dec 2009 06:37:44 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 06 Dec 2009 06:37:44 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 06 Dec 2009 06:37:42 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id B9078234C04C for ; Sat, 5 Dec 2009 22:37:20 -0800 (PST) Message-ID: <406969435.1260081440743.JavaMail.jira@brutus> Date: Sun, 6 Dec 2009 06:37:20 +0000 (UTC) From: "Hemanth Yamijala (JIRA)" To: mapreduce-issues@hadoop.apache.org Subject: [jira] Commented: (MAPREDUCE-896) Users can set non-writable permissions on temporary files for TT and can abuse disk usage. In-Reply-To: <795470237.1250829494816.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/MAPREDUCE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786575#action_12786575 ] Hemanth Yamijala commented on MAPREDUCE-896: -------------------------------------------- I looked at the Y! 20 patch. Some comments: - TaskTracker.buildPathForDeletion need not be public. - Was there a need to change CleanupQueue.addToQueue to take a FileSystem as argument instead of Configuration ? It has caused more changes than required by this patch - like in JobTracker and JobInProgress. Can we retain the original API and pass in a Configuration as before ? - When adding a task directory to delete, we are adding paths from all the local directories instead of just the one where files for the task are actually created. At a minimum, this is more work being done than necessary. More importantly, I don't know if there are any side effects this will cause. We can check which among the local directories the path belongs to (by doing a startsWith on the path) and all only that I think. - Shouldn't getLocalDirs take the tasktracker's configuration always ? In which case, it doesn't need to take the JobConf as a parameter, but can use fConf. - The log statements in CleanupQueue.PathCleanupThread.run are printing context.path which will only be the mapred local dir. We actually need the full path, otherwise the log statements could be confusing. Indeed, I would suggest a slight refactoring of the PathDeletionContext class, because as it exists currently we have one mode or the other that works - either a fullPath is provided or the path is built from other bits of data - like jobId, taskId etc. So, I would suggest something along the following lines: {code} class PathDeletionContext { String fullPath; FileSystem fs; protected String getPathForDeletion() { return fullPath; } protected void enablePathForCleanup() { // do nothing } } class TaskControllerPathDeletionContext extends PathDeletionContext { String user; String jobId; String taskId; boolean isCleanupTask; boolean isWorkDir; TaskController taskController; Path p; @Override protected String getPathForDeletion() { TaskTracker.buildPathForDeletion(this); } @Override protected void enablePathForCleanup() { taskController.enablePathForCleanup(this); } } {code} Then we can use PathDeletionContext in all cases where we don't need to use the taskController and the sub-class in other cases. CleanupQueue will naturally take and store PathDeletionContext objects. getPathForDeletion can be called to get the final path for deletion. I feel this design is cleaner. Thoughts ? - DefaultTaskController.enableTaskForCleanup should be package private. - In other APIs of LinuxTaskController - like buildLaunchTaskArgs, we find out if the task is a cleanup task and adjust paths directly. I think we can do the same thing for the new command also. This is not less secure, because we are still constructing the full path from the command args, but we abstract the task-controller from details like cleanup task. It is less clear whether the same thing should be done for workDir (i.e. should we append that to taskid in LinuxTaskController itself.) For that we may need a flag still, but I am OK if that is also resolved in LinuxTaskController itself and we completely eliminate flags to pass to task-controller. - The List of args in buildChangePathPermissionsArgs should be of the right size. (It's not 5). Also, I think it is useful to retain the order of commands the same. i.e. Let the mapred local dir be the first argument, then job-id, then task-id. - I think we must allocate the exact amount of memory required in build_dir_path. This can be done by defining a format string like TT_LOCAL_TASK_SCRIPT_PATTERN and then summing the lengths of this string, and the arguments like jobid, taskid, mapred local dir etc. Then we can use snprintf to build the path instead of multiple (unsafe) strcat and strcpy calls. Again, please look at get_task_file_path for an example. - Return values of calls like malloc should all be checked. When this is done, calls to build_dir_path can fail, which must also be checked. - In TaskRunner.deleteDirContents, I think if we get an InterruptedException, we should return immediately. Because otherwise, the operation is not really interrupted and it can get stuck permanently. - The intent of the testcase in TestChildTaskDirs is nice. But I am worried that since directory cleanup happens asynchronously, this might fail due to timing issues (like the TODO in the comment says). One option could be to use an inline directory cleaner. Can we try that ? - Should we also verify that the taskattemptdir is also cleaned up ? - There are some TODOs in the tests, can you please remove them after addressing the concerns ? > Users can set non-writable permissions on temporary files for TT and can abuse disk usage. > ------------------------------------------------------------------------------------------ > > Key: MAPREDUCE-896 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-896 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker > Affects Versions: 0.21.0 > Reporter: Vinod K V > Assignee: Ravi Gummadi > Fix For: 0.21.0 > > Attachments: MR-896.patch, MR-896.v1.patch, y896.v1.patch > > > As of now, irrespective of the TaskController in use, TT itself does a full delete on local files created by itself or job tasks. This step, depending upon TT's umask and the permissions set by files by the user, for e.g in job-work/task-work or child.tmp directories, may or may not go through successful completion fully. Thus is left an opportunity for abusing disk space usage either accidentally or intentionally by TT/users. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.