hadoop-mapreduce-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe
Date Mon, 04 Feb 2013 22:16:13 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Karthik Kambatla updated MAPREDUCE-4843:

    Attachment: mr-4843.patch

Uploading the patch from MAPREDUCE-4964 as that solves this issue in a simpler/cleaner way.
The discussion on that JIRA has all the details.

Applied the patch to latest branch-1 and it applies cleanly. Also, verified TestJobLocalizer
> When using DefaultTaskController, JobLocalizer not thread safe
> --------------------------------------------------------------
>                 Key: MAPREDUCE-4843
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 1.1.1
>            Reporter: zhaoyunjiong
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: MAPREDUCE-4843-branch-1.1.patch, mr-4843.patch
> In our cluster, jobs sometimes fail with the exception below:
> 2012-12-03 23:11:54,811 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/$username/jobcache/job_201212031626_1115/job.xml in any of the configured local directories
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:424)
> 	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
> 	at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1175)
> 	at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1058)
> 	at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2213)
> The root cause is that JobLocalizer is not thread safe.
> In the DefaultTaskController.initializeJob method:
>      JobLocalizer localizer = new JobLocalizer((JobConf)getConf(), user, jobid);
> but JobLocalizer simply keeps a reference to that conf.
> When two TaskLauncher threads (mapLauncher and reduceLauncher) call initializeJob at the
> same time, there are two JobLocalizer instances but only one conf instance.
> So sometimes ttConf.setStrings(JOB_LOCAL_CTXT, localDirs) will reset the previous job's
> This then causes the previous job's job.xml to be stored in another user's directory.
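
The shared-conf problem described above can be illustrated with a minimal, self-contained sketch. It does not use Hadoop: Conf, SharingLocalizer, CopyingLocalizer, the key value, and the /local/taskTracker paths are all hypothetical stand-ins, and the interleaving is replayed sequentially so the outcome is deterministic. The defensive copy is only one possible remedy and is not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfSketch {
    // Hypothetical key; stands in for the JOB_LOCAL_CTXT setting named in the report.
    static final String JOB_LOCAL_CTXT = "example.job.local.dirs";

    // Hypothetical stand-in for a mutable configuration object.
    static class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf() {}

        // Copy constructor: the new Conf gets its own independent map.
        Conf(Conf other) {
            props.putAll(other.props);
        }

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key);
        }
    }

    // Keeps a reference to the caller's conf, as the report describes.
    static class SharingLocalizer {
        private final Conf conf;

        SharingLocalizer(Conf conf, String user) {
            this.conf = conf;
            // Writes per-job state into an object that other localizers also hold.
            conf.set(JOB_LOCAL_CTXT, "/local/taskTracker/" + user);
        }

        String localDirs() {
            return conf.get(JOB_LOCAL_CTXT);
        }
    }

    // Possible remedy (an assumption, not the actual patch): copy before mutating.
    static class CopyingLocalizer {
        private final Conf conf;

        CopyingLocalizer(Conf conf, String user) {
            this.conf = new Conf(conf);
            this.conf.set(JOB_LOCAL_CTXT, "/local/taskTracker/" + user);
        }

        String localDirs() {
            return conf.get(JOB_LOCAL_CTXT);
        }
    }

    public static void main(String[] args) {
        Conf shared = new Conf();

        // Two localizers built from the same conf, as two launcher threads might do.
        SharingLocalizer a = new SharingLocalizer(shared, "alice");
        SharingLocalizer b = new SharingLocalizer(shared, "bob");
        // The second constructor overwrote the first job's setting.
        System.out.println("sharing, job a: " + a.localDirs());
        System.out.println("sharing, job b: " + b.localDirs());

        CopyingLocalizer c = new CopyingLocalizer(shared, "alice");
        CopyingLocalizer d = new CopyingLocalizer(shared, "bob");
        // Each localizer now sees only its own setting.
        System.out.println("copying, job c: " + c.localDirs());
        System.out.println("copying, job d: " + d.localDirs());
    }
}
```

With the sharing variant, job a ends up reading the directories that were set for job b, which matches the symptom of a job.xml landing under another user's directory. With the copying variant, each localizer mutates only its private copy, so construction order no longer matters.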

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
