hadoop-common-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4491) Per-job local data on the TaskTracker node should have right access-control
Date Thu, 16 Jul 2009 07:03:14 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4491:
------------------------------

    Attachment: HADOOP-4491-20090716-mapred.txt

Uploading mapred part of the patch that incorporates most of the review comments.
{quote}
 - In the case of jvm reuse, we still need to make the output available. Because it seems
the task completion event is sent as soon as the task is done not after jvm exits. This is
only a theoretical case possibly, but it still will be good to keep the code paths identical.
 - finalizeTaskDirs (or parts of it) needs to be synchronized.
 - Path permissions should be taken care of: if mapred.local.dir doesn't have execute
permissions for others, we need to set them.
 - In setup, we seem to be setting permissions even if directory creation fails. Also, shouldn't
we set permissions even if the directory already exists, so that they are right as per our requirement?
 - Cache directory probably doesn't need 777 because it is not written to by the tasks. We
can probably retain this if HADOOP-4490 set like permissions, since this will be addressed
in HADOOP-4493.
 - getBaseIntermediateOutputDir seems an overkill if it is just returning a constant.
 - Changes in SpillRecord seem to be unnecessary.
 - TaskController.FILE_PERMISSIONS doesn't seem to be used anywhere.
 - Rename finalizeTaskDirs as finalizeTask - it matches with the naming convention for the
other apis in taskcontroller.
 - Add a comment about why task-work is 755.
 - mapred.child.local.dir has an extra comma at the end.
{quote}
     -- Done
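
The directory-setup comments above (skip permission changes when creation fails, reapply them when the directory already exists, and make sure others can traverse the parent path) can be sketched roughly as follows. This is only an illustration and not the code in the patch: the class and method names are hypothetical, and it assumes a POSIX file system and the java.nio.file API, which the patch itself may not use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Hadoop.
public class LocalDirPermissions {

    // Creates dir if needed, then applies perms (e.g. "rwxr-xr-x").
    // Permissions are set whether or not the directory already existed,
    // and are never touched when creation fails.
    public static boolean ensureDir(Path dir, String perms) {
        try {
            Files.createDirectories(dir); // no-op if dir is already a directory
            Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString(perms));
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            return false; // creation or chmod failed, or the file system is not POSIX
        }
    }

    // Adds execute permission for others so the path can be traversed.
    public static boolean allowTraversalByOthers(Path dir) {
        try {
            Set<PosixFilePermission> perms =
                new HashSet<>(Files.getPosixFilePermissions(dir));
            if (perms.add(PosixFilePermission.OTHERS_EXECUTE)) {
                Files.setPosixFilePermissions(dir, perms);
            }
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            return false;
        }
    }

    // Returns the permissions as a string such as "rwx------", or null on error.
    public static String currentPerms(Path dir) {
        try {
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException | UnsupportedOperationException e) {
            return null;
        }
    }
}
```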

    * Permissions for the job directory are changed multiple times, once per task.
     -- Done. Added a new INITIALIZE_JOB command.
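
To illustrate the idea of doing job-level setup once per job instead of once per task, here is a minimal sketch. It is not the patch's INITIALIZE_JOB implementation; the class, the method, and the use of a Runnable for the job-level work are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker, not part of Hadoop.
public class JobInitTracker {
    private final Set<String> initializedJobs = new HashSet<>();

    // Runs initializeJob only for the first task of a given job.
    // The method is synchronized so that a second task of the same job
    // cannot return before the first caller has finished the setup.
    public synchronized boolean initializeJobIfNeeded(String jobId, Runnable initializeJob) {
        if (initializedJobs.add(jobId)) {
            initializeJob.run(); // e.g. set job directory permissions once
            return true;
        }
        return false; // job already initialized; nothing to redo
    }
}
```

With this shape, every task launch can call initializeJobIfNeeded, and the job directory permissions are changed only on the first call for each job.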

    * Code related to creation of work dirs is removed from TaskRunner. Where is it created
now ?
      -- This is actually not needed. The work dirs are already created earlier, in TaskTracker.localizeTask()
 (creation of cwd)

    * Didn't understand the purpose of initStatus. Since it starts out being true, wouldn't
it always remain true ?
     -- initStatus is used to track if directory creation is failing on all the disks. It
was incorrectly initialized to true. Fixed this.
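
The initStatus fix described above amounts to starting the flag at false and setting it to true only when some disk succeeds. A minimal sketch, assuming a hypothetical helper that takes the list of local dirs and a sub-directory name (neither the class nor the signature comes from the patch):

```java
import java.io.File;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
public class LocalDirInit {

    // Tries to create subDir under every configured local dir.
    // Returns false only if creation failed on all of the disks.
    public static boolean createOnAllDisks(List<File> localDirs, String subDir) {
        boolean initStatus = false; // pessimistic start; true would mask total failure
        for (File root : localDirs) {
            File dir = new File(root, subDir);
            if (dir.isDirectory() || dir.mkdirs()) {
                initStatus = true; // at least one disk is usable
            }
        }
        return initStatus;
    }
}
```

Had initStatus started out as true, the method would return true even when every mkdirs call failed, which is the bug the review comment points at.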

Things to be done:
{quote}
    - The code models some of the commands passed to the taskcontroller as 'first task' for
JVM or 'not first task' for JVM. This complicates the contract between the TT and the task-controller.
I think at a minimum these decisions should be restricted only to the JVM manager, and the
contract between the TT and task-controller should be handled by modelling new commands like
FINALIZE_JVM and FINALIZE_TASK.
    - The patch introduces a log directory argument to the task-controller. I think this is
not required as there is only one value for the hadoop.log.dir for all tasks.
    - TODOs in the patch must be discussed and resolved.      
    - Rename finalizeTaskDirs as finalizeTask. To be done in the task-controller code

1. Setting secure permissions by default vs doing so only in the LinuxTaskController
2. Approach for sharing common files between TT and child
3. Sandboxing task runtime environment by changing values of mapred.local.dir for the child
{quote}


> Per-job local data on the TaskTracker node should have right access-control
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-4491
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4491
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: security
>            Reporter: Arun C Murthy
>            Assignee: Vinod K V
>         Attachments: HADOOP-4491-20090623-common.1.txt, HADOOP-4491-20090623-mapred.1.txt,
> HADOOP-4491-20090703-common.1.txt, HADOOP-4491-20090703-common.txt, HADOOP-4491-20090703.1.txt,
> HADOOP-4491-20090703.txt, HADOOP-4491-20090707-common.txt, HADOOP-4491-20090707.txt, HADOOP-4491-20090716-mapred.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

