hadoop-mapreduce-issues mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-693) Conf files not moved to "done" subdirectory after JT restart
Date Thu, 02 Jul 2009 06:38:47 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726351#action_12726351 ]

Amar Kamat commented on MAPREDUCE-693:
--------------------------------------

bq.  The old conf files remain in the history folder and fail to be moved to "done" subdirectory
There is no need to move the conf file to the done folder. In this case the job is run as
a new job, and hence a new conf file is created for it. The jobhistory file gets deleted
because it is part of the recovery (checkpoint) process, whereas the conf file doesn't play
any role in recovery. Here is what is happening:
# jobtracker starts with id _id1_
# job job1 is submitted and creates history file hostname_id1_job1_user_jobname and conf file
hostname_id1_job1_conf.xml
# jobtracker restarts with id _id2_
# jobtracker tries to recover the job. There are 2 possibilities here:
 ## If the job-initialization thread inits the job before the recovery-manager picks up the
job for recovery, then the new filename would be hostname_id1_job1_user_jobname.recover and
the conf file would be hostname_id1_job1_conf.xml. In such a case there won't be any garbage
left in the history folder.
 ## If the recovery-manager picks up the job before the init-thread, then it will assume
that there is nothing to recover and will delete hostname_id1_job1_user_jobname (leaving
hostname_id1_job1_conf.xml). When the job inits, it will take new filenames, i.e. hostname_id2_job1_user_jobname
and hostname_id2_job1_conf.xml. Only in this case is the old conf file (hostname_id1_job1_conf.xml)
left behind in the history folder.
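
To make the two orderings concrete, here is a toy model of the history folder. It is only a sketch of the sequence described above, not JobTracker code: the function names (history_name, conf_name, simulate, orphaned_confs) are made up for illustration, and the folder is just a set of filenames.

```python
# Toy model of the history folder after a JobTracker restart.
# All names here are hypothetical; this is not Hadoop code.

def history_name(jt_id, job="job1"):
    return f"hostname_{jt_id}_{job}_user_jobname"

def conf_name(jt_id, job="job1"):
    return f"hostname_{jt_id}_{job}_conf.xml"

def simulate(init_first):
    """Return the filenames in the history folder after the restart."""
    # Steps 1-2: the first jobtracker (id1) wrote a history file and a conf file.
    folder = {history_name("id1"), conf_name("id1")}
    if init_first:
        # Case 1: the init thread wins. The history file becomes *.recover
        # and the conf file keeps its id1 name.
        folder.discard(history_name("id1"))
        folder.add(history_name("id1") + ".recover")
    else:
        # Case 2: the recovery manager wins, finds nothing to recover and
        # deletes the old history file, leaving the old conf file in place.
        folder.discard(history_name("id1"))
        # The job then inits as a new job under the restarted jobtracker (id2).
        folder.add(history_name("id2"))
        folder.add(conf_name("id2"))
    return folder

def orphaned_confs(folder):
    """Conf files with no history file sharing the same hostname_id_job prefix."""
    orphans = set()
    for name in folder:
        if name.endswith("_conf.xml"):
            prefix = name[: -len("conf.xml")]  # e.g. "hostname_id1_job1_"
            has_history = any(
                other != name and other.startswith(prefix) for other in folder
            )
            if not has_history:
                orphans.add(name)
    return orphans

if __name__ == "__main__":
    print(sorted(orphaned_confs(simulate(init_first=True))))
    print(sorted(orphaned_confs(simulate(init_first=False))))
```

In this model, only the ordering where the recovery-manager runs first leaves a conf file without a matching history file, which is the leftover reported in this issue.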

AFAIK this is a timing issue. I think a proper fix for all these corner cases is MAPREDUCE-11.
Thoughts?

> Conf files not moved to "done" subdirectory after JT restart
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-693
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-693
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Ramya R
>            Priority: Minor
>             Fix For: 0.20.1
>
>
> After MAPREDUCE-516, when a job is submitted and the JT is restarted (before job files
> have been written) and the job is killed after recovery, the conf files fail to be moved to
> the "done" subdirectory.
> The exact scenario to reproduce this issue is:
> * Submit a job
> * Restart JT before anything is written to the job files
> * Kill the job
> * The old conf files remain in the history folder and fail to be moved to "done" subdirectory

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

