Message-ID: <737525313.1247462294805.JavaMail.jira@brutus>
Date: Sun, 12 Jul 2009 22:18:14 -0700 (PDT)
From: "Hemanth Yamijala (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-4491) Per-job local data on the TaskTracker node should have right access-control

[ https://issues.apache.org/jira/browse/HADOOP-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730220#action_12730220 ]

Hemanth
Yamijala commented on HADOOP-4491:
------------------------------------------

Some other comments on the code:

- The code models some of the commands passed to the task-controller as 'first task' for JVM or 'not first task' for JVM. This complicates the contract between the TT and the task-controller. I think at a minimum these decisions should be restricted to the JVM manager, and the contract between the TT and the task-controller should be handled by modelling new commands like FINALIZE_JVM and FINALIZE_TASK.
- The patch introduces a log directory argument to the task-controller. I think this is not required, as there is only one value of hadoop.log.dir for all tasks.
- Code related to creation of work dirs is removed from TaskRunner. Where are they created now?
- Permissions for the job directory are changed multiple times, once per task.
- finalizeTaskDirs (or parts of it) needs to be synchronized.
- In the case of JVM reuse, we still need to make the output available, because it seems the task completion event is sent as soon as the task is done, not after the JVM exits. This is possibly only a theoretical case, but it would still be good to keep the code paths identical.
- Should path permissions be taken care of? If mapred.local.dir doesn't have execute permissions for others, we need to set them.
- In setup, we seem to be setting permissions even if directory creation fails. Also, shouldn't we set permissions even if the directory already exists, so that they are right as per our requirement?
- Didn't understand the purpose of initStatus. Since it starts out being true, wouldn't it always remain true?
- The cache directory probably doesn't need 777 because it is not written to by the tasks. We can probably retain this if HADOOP-4490 set similar permissions, since this will be addressed in HADOOP-4493.
- getBaseIntermediateOutputDir seems like overkill if it is just returning a constant.
- Changes in SpillRecord seem to be unnecessary.
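
To make the two setup comments above concrete (permissions being set even when directory creation fails, and not being reset when the directory already exists), here is a rough sketch of the behaviour I have in mind. It is not code from the patch: the class LocalDirSetup and the methods setupDir and setupAll are hypothetical names, and it uses plain java.nio.file calls rather than the Hadoop file system APIs. It also shows a status flag that starts out true and is actually cleared on failure.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the HADOOP-4491 patch.
final class LocalDirSetup {

    // Creates dir (and missing parents) if needed, then sets its permissions.
    // Returns false, without touching permissions, when creation fails.
    static boolean setupDir(Path dir, String perms) {
        try {
            // No-op when dir already exists as a directory.
            Files.createDirectories(dir);
        } catch (IOException e) {
            return false;
        }
        try {
            // Applied on every call, so a pre-existing directory with the
            // wrong mode is corrected as well.
            Set<PosixFilePermission> p = PosixFilePermissions.fromString(perms);
            Files.setPosixFilePermissions(dir, p);
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            return false;
        }
    }

    // Starts out true and becomes false if any single directory fails.
    // Every directory is still attempted.
    static boolean setupAll(List<Path> dirs, String perms) {
        boolean status = true;
        for (Path dir : dirs) {
            status = setupDir(dir, perms) && status;
        }
        return status;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("tt-local");
        List<Path> dirs = Arrays.asList(base.resolve("a"), base.resolve("b"));
        System.out.println("setup ok: " + setupAll(dirs, "rwxr-xr-x"));
    }
}
```

The point of the sketch is only the ordering: create, bail out on failure, then set permissions unconditionally. It needs a file system that supports POSIX permissions.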

Some nits:

- There are some else blocks without code, but with a code comment explaining why there's no else block. While useful, it is somewhat unconventional. I would recommend the reason be moved to a comment at the start of the if block itself.
- TODOs in the patch must be discussed and resolved.
- TaskController.FILE_PERMISSIONS doesn't seem to be used anywhere.
- Rename finalizeTaskDirs to finalizeTask; that matches the naming convention of the other APIs in the task-controller.
- Add a comment about why task-work is 755.
- mapred.child.local.dir has an extra comma at the end.

> Per-job local data on the TaskTracker node should have right access-control
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-4491
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4491
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: security
>            Reporter: Arun C Murthy
>            Assignee: Vinod K V
>         Attachments: HADOOP-4491-20090623-common.1.txt, HADOOP-4491-20090623-mapred.1.txt, HADOOP-4491-20090703-common.1.txt, HADOOP-4491-20090703-common.txt, HADOOP-4491-20090703.1.txt, HADOOP-4491-20090703.txt, HADOOP-4491-20090707-common.txt, HADOOP-4491-20090707.txt
>
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.