hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-927) Cleanup of task-logs should happen in TaskTracker instead of the Child
Date Mon, 01 Mar 2010 08:26:06 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Vinod K V updated MAPREDUCE-927:

    Status: Open  (was: Patch Available)

Looked at the patch. Have some comments:

 - Creation of the {{localFS}} should not be in a static block. In the past also, we did this
and realized it creates a kind of circular initialization of loggers and results in NPE while
creating log objects which can be seen in task-logs. The current way of creation of {{localFS}}
should be retained.
 - We can move {{DEFAULT_USER_LOG_RETAIN_HOURS}} to {{TaskLogsCleanupThread}}.
 - Shall we rename {{getJobUserLogDir() to a simpler {{getJobDir()}}? And {{getBaseDir(String)}}
to {{getAttemptDir(String)}} to be clear? I think it's ok like this as {{TaskLog.getJobDir()}}
clearly means it is job-dir for logs.
 - Also, I think it's high time {{TaskLog.java}} is made @InterfaceAudience.Private.

 - Rename the class to {{TaskLogsMonitor}}, so that we are consistent going forward with MAPREDUCE-1100.
 - Set the audience visibility of the class to private?
 - {{threadSleepTime}} is not configurable. May not be a public documented configuration,
but still we need one.
 - Constructor: Shouldn't the volume on which the disk-service works be {{getUserLogsDir()}}
instead of {{getBaseLogDir()}}? The correctness is not lost with the current patch as we are
always passing absolute paths to the disk-service, but I think we should change it anyways.
Also can't we simply construct a local-filesystem here itself, instead of calling TaskLog.getLocalFileSystem()?
This will mostly avoid your changes regarding my first comment in TaskLog.java above.
 - For the sake of correctness, in {{removeOldUserLogs()}}, the job should be removed only
after the deletion of the log file.
 - Throughout the class, we should use a clock instead of directly using {{System.currentMillis()}}.
This will better the testing.
 - Shouldn't {{clearOldUserLogs()}} be done as part of constructor itself? This is the pattern
that {{MRAsyncDiskService}} uses, for example.

 - The {{logscleanup}} thread is not joined/killed in the {{TaskTracker.close()}}. So, there
will be zombie threads in the system on a re-init and may well interfere with the new thread.
 - Shouldn't we be setting 770-user-mapred permissions on userlogs/$jobid as part of job-localization?
Granted 711 is enough for now, but this slightly deviates from the current security on the
directories - we always have the most secure permissions possible on the files/dirs. If we
do this, TestLocalizationWithLinuxTaskController should also test the permissions of the joblogdir.
 - The following code fragment at +1035 can be moved out of {{localizeJobFiles()}} into a
new method {{initializeJobLogDir()}} and can be called directly from {{localizeJob()}}.
 - One case is still not handled: TT reinits while the Job is still running. After re-init,
no tasks of the running job arrive at this TT. Retain-hours after re-init TT removes the job's
tasks' logs even though the job is still running elsewhere. Doing this will need TT to specially
communicate with JT and from what I understand, we are not doing it here. If that is the case,
can we simply add a test for this too in {{TestUserLogCleanup}}?

 - Use {{TaskTracker.initializeJobLogDir()}} mentioned above in {{setupJobLogDirectory()}}.
 - Once we use a clock in TaskLogCleanupThread,
    -- the tests can be modified to use this.
    -- we can test what exactly the modified retainTimeStamp is after re-init.
 - We should create some attempt dirs also in the joblog dir with appropriate permissions
and verify the proper cleanup.
 - Document(javadoc) the two tests?

 - This needs to be fixed to reflect the new directory hierarchy of the logs.

 - Should TaskLogs section be changed to explicitly specify the new directory hierarchy?

Cancelling the patch for the sake of these changes.

> Cleanup of task-logs should happen in TaskTracker instead of the Child
> ----------------------------------------------------------------------
>                 Key: MAPREDUCE-927
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-927
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.21.0
>         Attachments: patch-927-1.txt, patch-927.txt
> Task logs' cleanup is being done in Child now. This is undesirable atleast for two reasons:
1) failures while cleaning up will affect the user's tasks, and 2) the task's wall time will
get affected due to operations that TT actually should own.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message