hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-927) Cleanup of task-logs should happen in TaskTracker instead of the Child
Date Mon, 08 Mar 2010 07:29:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842561#action_12842561

Vinod K V commented on MAPREDUCE-927:

Looks close. Two major comments:
 - Job's log directory need not be 770 in all cases. It can be 570 when the job-submitter
is not the same as the TT user. So, this needs a merge with MAPREDUCE-890(which is close to
the finish line).
 - After this patch, with a job specific directory for logs, jobacls.xml needs to be moved
into job-dir. Will file a separate JIRA issue for this.

Minor code level comments follow:

 -   The comment for DEFAULT_THREAD_SLEEP_TIME should read 1 hour instead of 1day.
 - I think we can name the APIs in this class a bit better. removeJobFromLogDeletion can be
unMarkJobFromLogDeletion? similarly addJobLogsForDeletion as markJobForLogDeletion? removeOldUserLogs
to processCompletedJobs?
 - Why maintain two different APIs - addJobLogsForDeletion and addJobForLogDeletion? These
two can be merged?
 - In run() method, sleep should be moved into a finally block?
 - {{addJobLogsForDeletion()}} should not itself take jobcompletion time, it should instead
be taken from clock..

 - {{Localizer.initializeJobUserLogDir()}} can be renamed to {{initializeJobLogDir()}} and
can be helped with some javadoc
 - I actually meant moving {{TaskTracker.initializeJobLogDir}} out of {{localizeJobFiles}}
into {{localizeJob()}} itself. Is it possible? Also comment inside this method needs grammatical
 - No need for documenting {{mapreduce.tasktracker.userlogcleanup.sleeptime}} ?
 - {{TestTaskTrackerLocalization.verifyUserLogsCleanup()}}: shouldn't you be setting the clock
and use an inline cleaner? This is mainly to avoid parallel execution by the thread and the
test-case. Otherwise it is possible that we run in errors and test failures (?).

 - Should use clock and inline clear similar to the last comment above
 - Can directly use UtilsForTest.FakeClock instead of another FakeClock.
 - Can we augment the current tests to check that TOBEDELETED isn't actually deleted?
 - Rename the two tests appropriately?
 - Should use clock.advance() to improve testUserLogCleanup().

> Cleanup of task-logs should happen in TaskTracker instead of the Child
> ----------------------------------------------------------------------
>                 Key: MAPREDUCE-927
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-927
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.22.0
>         Attachments: patch-927-1.txt, patch-927-2.txt, patch-927.txt
> Task logs' cleanup is being done in Child now. This is undesirable atleast for two reasons:
1) failures while cleaning up will affect the user's tasks, and 2) the task's wall time will
get affected due to operations that TT actually should own.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message