hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-927) Cleanup of task-logs should happen in TaskTracker instead of the Child
Date Mon, 08 Mar 2010 07:29:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842561#action_12842561
] 

Vinod K V commented on MAPREDUCE-927:
-------------------------------------

Looks close. Two major comments:
 - Job's log directory need not be 770 in all cases. It can be 570 when the job-submitter
is not the same as the TT user. So, this needs a merge with MAPREDUCE-890(which is close to
the finish line).
 - After this patch, with a job specific directory for logs, jobacls.xml needs to be moved
into job-dir. Will file a separate JIRA issue for this.

Minor code level comments follow:

UserLogCleaner
 -   The comment for DEFAULT_THREAD_SLEEP_TIME should read 1 hour instead of 1day.
 - I think we can name the APIs in this class a bit better. removeJobFromLogDeletion can be
unMarkJobFromLogDeletion? similarly addJobLogsForDeletion as markJobForLogDeletion? removeOldUserLogs
to processCompletedJobs?
 - Why maintain two different APIs - addJobLogsForDeletion and addJobForLogDeletion? These
two can be merged?
 - In run() method, sleep should be moved into a finally block?
 - {{addJobLogsForDeletion()}} should not itself take jobcompletion time, it should instead
be taken from clock..

Miscellaneous
 - {{Localizer.initializeJobUserLogDir()}} can be renamed to {{initializeJobLogDir()}} and
can be helped with some javadoc
 - I actually meant moving {{TaskTracker.initializeJobLogDir}} out of {{localizeJobFiles}}
into {{localizeJob()}} itself. Is it possible? Also comment inside this method needs grammatical
fixes.
 - No need for documenting {{mapreduce.tasktracker.userlogcleanup.sleeptime}} ?
 - {{TestTaskTrackerLocalization.verifyUserLogsCleanup()}}: shouldn't you be setting the clock
and use an inline cleaner? This is mainly to avoid parallel execution by the thread and the
test-case. Otherwise it is possible that we run in errors and test failures (?).

TestUserLogsCleanup
 - Should use clock and inline clear similar to the last comment above
 - Can directly use UtilsForTest.FakeClock instead of another FakeClock.
 - Can we augment the current tests to check that TOBEDELETED isn't actually deleted?
 - Rename the two tests appropriately?
 - Should use clock.advance() to improve testUserLogCleanup().

> Cleanup of task-logs should happen in TaskTracker instead of the Child
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-927
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-927
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.22.0
>
>         Attachments: patch-927-1.txt, patch-927-2.txt, patch-927.txt
>
>
> Task logs' cleanup is being done in Child now. This is undesirable atleast for two reasons:
1) failures while cleaning up will affect the user's tasks, and 2) the task's wall time will
get affected due to operations that TT actually should own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message