hadoop-common-dev mailing list archives

From "Sanjay Dahiya (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-239) job tracker WI drops jobs after 24 hours
Date Wed, 30 Aug 2006 08:56:24 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-239?page=comments#action_12431511 ] 
            
Sanjay Dahiya commented on HADOOP-239:
--------------------------------------

If we keep a single history file for the JobTracker, it will grow very large very quickly,
especially when there are many small tasks. On the other hand, if we roll the file over
every day, the start and end events for longer jobs, or for jobs that start near the end of
the day, will land in different log files. We would still be able to see daily activity, but
drilling into a job would be a problem, since we would have to search multiple huge files
for job-specific events.
Yoram and I discussed this over IM, and here is the current approach.

We maintain a master file for all jobs. This file contains only job start/finish events,
along with the number of failed tasks recorded at finish. If the JobTracker dies before a job
finishes, the number of failed tasks is not logged in this file.
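
As a rough sketch of what a master-file record could look like: the class name, field
names, and tab-separated layout below are assumptions for illustration only, not the
actual format. A job whose JobTracker died simply has a JOB_START line with no matching
JOB_FINISH line, so no failed-task count is ever written for it.

```java
// Hypothetical master-file event. Layout and names are assumptions, not Hadoop's format.
class MasterEvent {
    enum Type { JOB_START, JOB_FINISH }

    final Type type;
    final String jobId;
    final long timestampMillis;
    final int failedTasks; // only meaningful for JOB_FINISH; -1 otherwise

    private MasterEvent(Type type, String jobId, long timestampMillis, int failedTasks) {
        this.type = type;
        this.jobId = jobId;
        this.timestampMillis = timestampMillis;
        this.failedTasks = failedTasks;
    }

    static MasterEvent start(String jobId, long timestampMillis) {
        return new MasterEvent(Type.JOB_START, jobId, timestampMillis, -1);
    }

    static MasterEvent finish(String jobId, long timestampMillis, int failedTasks) {
        return new MasterEvent(Type.JOB_FINISH, jobId, timestampMillis, failedTasks);
    }

    // One event per line; the failed-task count is appended only on finish.
    String toLine() {
        String line = type + "\t" + jobId + "\t" + timestampMillis;
        return type == Type.JOB_FINISH ? line + "\t" + failedTasks : line;
    }

    // Inverse of toLine(); expects exactly the layout written above.
    static MasterEvent parse(String line) {
        String[] f = line.split("\t");
        Type t = Type.valueOf(f[0]);
        int failed = (t == Type.JOB_FINISH) ? Integer.parseInt(f[3]) : -1;
        return new MasterEvent(t, f[1], Long.parseLong(f[2]), failed);
    }
}
```

Keeping one small line per job event is what keeps the master file cheap to scan when
rendering the job list.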

For each job we create a separate history log file. This file contains task and task-attempt
start and finish times, along with failures, if any.
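
A minimal sketch of a writer for such a per-job file follows. The class name, the event
names (ATTEMPT_START and so on), and the line layout are all hypothetical; the sketch only
shows the idea of appending one line per task-attempt event to a sink owned by a single
job.

```java
import java.io.PrintWriter;
import java.io.Writer;

// Hypothetical per-job history writer; event names and layout are assumptions.
class JobHistoryWriter {
    private final PrintWriter out;

    // The sink would normally be the job's own history file.
    JobHistoryWriter(Writer sink) {
        this.out = new PrintWriter(sink);
    }

    void attemptStarted(String taskId, int attempt, long startMillis) {
        write("ATTEMPT_START\t" + taskId + "\t" + attempt + "\t" + startMillis);
    }

    void attemptFinished(String taskId, int attempt, long finishMillis) {
        write("ATTEMPT_FINISH\t" + taskId + "\t" + attempt + "\t" + finishMillis);
    }

    void attemptFailed(String taskId, int attempt, long failMillis, String reason) {
        write("ATTEMPT_FAILED\t" + taskId + "\t" + attempt + "\t" + failMillis + "\t" + reason);
    }

    // Flush after every event so a JobTracker crash loses as little as possible.
    private void write(String line) {
        out.print(line + "\n");
        out.flush();
    }
}
```

Because the file belongs to one job, a viewer only has to parse this one file to show
that job's task details.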

The master index is rolled over every month. During rollover we look for all jobs that
have not finished and move them to the new file, and discard old jobs. The detailed history
logs for jobs older than a month are deleted.
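
The rollover step could be sketched roughly as below. All names are hypothetical, the
index is modeled as an in-memory list rather than a file, and the sketch assumes that
detail logs are deleted for exactly the jobs discarded at rollover, which is only one
possible reading of "older than a month".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory view of one master-index entry.
class IndexEntry {
    final String jobId;
    final long startMillis;
    final long finishMillis; // -1 while the job has not finished

    IndexEntry(String jobId, long startMillis, long finishMillis) {
        this.jobId = jobId;
        this.startMillis = startMillis;
        this.finishMillis = finishMillis;
    }

    boolean isFinished() {
        return finishMillis >= 0;
    }
}

// Outcome of a rollover: what goes into the new index, and which logs to remove.
class RolloverResult {
    final List<IndexEntry> carriedOver = new ArrayList<>();
    final List<String> detailLogsToDelete = new ArrayList<>();
}

class MasterIndexRollover {
    // Unfinished jobs move to the new index; finished jobs are discarded
    // and their per-job detail logs are marked for deletion (assumption).
    static RolloverResult rollover(List<IndexEntry> oldIndex) {
        RolloverResult result = new RolloverResult();
        for (IndexEntry entry : oldIndex) {
            if (entry.isFinished()) {
                result.detailLogsToDelete.add(entry.jobId);
            } else {
                result.carriedOver.add(entry);
            }
        }
        return result;
    }
}
```

Carrying unfinished jobs forward means a job's start and finish events always end up in
the same master file, which avoids the split-across-files problem described above.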

The master index will be used to render the main JSP for job history. Clicking on a job
will cause the corresponding job file to be loaded, parsed, and displayed on the respective JSPs.


The start time of the JobTracker is used as an extra key to uniquely identify jobs, since the
same job IDs are reused when the JobTracker restarts.
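
One way to picture the composite key is shown below. The class and field names are
hypothetical; the point is only that two jobs with the same job ID but different
JobTracker start times compare as different jobs.

```java
import java.util.Objects;

// Hypothetical composite key: JobTracker start time plus job ID.
final class HistoryJobKey {
    final long jobTrackerStartMillis;
    final String jobId;

    HistoryJobKey(long jobTrackerStartMillis, String jobId) {
        this.jobTrackerStartMillis = jobTrackerStartMillis;
        this.jobId = jobId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof HistoryJobKey)) return false;
        HistoryJobKey that = (HistoryJobKey) other;
        return jobTrackerStartMillis == that.jobTrackerStartMillis
                && jobId.equals(that.jobId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(jobTrackerStartMillis, jobId);
    }

    // Could also serve as a unique name for the per-job history file (assumption).
    @Override
    public String toString() {
        return jobTrackerStartMillis + "_" + jobId;
    }
}
```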

We will not have any host-specific view of tasks in this case. 

> job tracker WI drops jobs after 24 hours
> ----------------------------------------
>
>                 Key: HADOOP-239
>                 URL: http://issues.apache.org/jira/browse/HADOOP-239
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Yoram Arnon
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>
> The jobtracker's WI, keeps track of jobs executed in the past 24 hours.
> if the cluster was idle for a day (say Sunday) it drops all its history.
> Monday morning, the page is empty.
> Better would be to store a fixed number of jobs (say 10 each of succeeded and failed jobs).
> Also, if the job tracker is restarted, it loses all its history.
> The history should be persistent, withstanding restarts and upgrades.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
