hadoop-mapreduce-issues mailing list archives

From "Ravi Prakash (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4088) Task stuck in JobLocalizer prevented other tasks on the same node from committing
Date Fri, 30 Mar 2012 16:16:30 GMT
Task stuck in JobLocalizer prevented other tasks on the same node from committing
---------------------------------------------------------------------------------

                 Key: MAPREDUCE-4088
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4088
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv1
    Affects Versions: 0.20.205.0
            Reporter: Ravi Prakash
            Priority: Critical


We saw that, as a result of HADOOP-6963, one task was stuck in the following stack:

Thread 23668: (state = IN_NATIVE)
 - java.io.UnixFileSystem.getBooleanAttributes0(java.io.File) @bci=0 (Compiled frame; information may be imprecise)
 - java.io.UnixFileSystem.getBooleanAttributes(java.io.File) @bci=2, line=228 (Compiled frame)
 - java.io.File.exists() @bci=20, line=733 (Compiled frame)
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=3, line=446 (Compiled frame)
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=52, line=455 (Compiled frame)
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=52, line=455 (Compiled frame)
....
.... TONS MORE OF THIS SAME LINE
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=52, line=455 (Compiled frame)
.....
.....
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=52, line=455 (Compiled frame)
 - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=52, line=455 (Interpreted frame)
ne=451 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer.downloadPrivateCacheObjects(org.apache.hadoop.conf.Configuration, java.net.URI[], org.apache.hadoop.fs.Path[], long[], boolean[], boolean) @bci=150, line=324 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer.downloadPrivateCache(org.apache.hadoop.conf.Configuration)
@bci=40, line=349 (Interpreted frame) 51, line=383 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer.runSetup(java.lang.String, java.lang.String, org.apache.hadoop.fs.Path, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=46, line=477 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer$3.run() @bci=20, line=534 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer$3.run() @bci=1, line=531 (Interpreted frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
 - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
 - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1082 (Interpreted frame)
 - org.apache.hadoop.mapred.JobLocalizer.main(java.lang.String[]) @bci=266, line=530 (Interpreted frame)
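
The repeated getDU frames above suggest a recursive disk-usage walk that descended very deeply. As a hedged illustration only, the sketch below shows one way to total file sizes under a directory without following symbolic links, so a link pointing back at an ancestor cannot make the walk loop. It is not Hadoop's FileUtil.getDU, and it assumes (the report does not say this) that link cycles or very deep trees are the concern; the class name DiskUsageSketch and the method diskUsage are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration; not part of Hadoop.
final class DiskUsageSketch {

    // Sums the sizes of regular files under root.
    // Files.walkFileTree(Path, FileVisitor) does not follow symbolic links,
    // so a link is reported to visitFile with the link's own attributes
    // and is never descended into.
    static long diskUsage(Path root) throws IOException {
        final long[] total = {0L};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Count regular files only; symbolic links and special files are skipped.
                if (attrs.isRegularFile()) {
                    total[0] += attrs.size();
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: an unreadable or vanished entry should not abort the whole walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }
}
```

Calling DiskUsageSketch.diskUsage(dir) on a directory that contains a symbolic link back to dir returns the total size of the regular files only and terminates, because the link itself is never traversed.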

Meanwhile, all other tasks on the same node were stuck in:
Thread 32141: (state = BLOCKED)
 - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
 - org.apache.hadoop.mapred.Task.commit(org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter, org.apache.hadoop.mapreduce.OutputCommitter) @bci=24, line=980 (Compiled frame)
 - org.apache.hadoop.mapred.Task.done(org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=146, line=871 (Interpreted frame)
 - org.apache.hadoop.mapred.ReduceTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=470, line=423 (Interpreted frame)
 - org.apache.hadoop.mapred.Child$4.run() @bci=29, line=255 (Interpreted frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
 - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
 - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1082 (Interpreted frame)
 - org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=738, line=249 (Interpreted frame)

This should never happen. A stuck task should never prevent other tasks from different jobs
on the same node from committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
