hadoop-mapreduce-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6002) MR task should prevent report error to AM when process is shutting down
Date Thu, 24 Jul 2014 02:49:39 GMT
Wangda Tan created MAPREDUCE-6002:
-------------------------------------

             Summary: MR task should prevent report error to AM when process is shutting down
                 Key: MAPREDUCE-6002
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6002
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: task
    Affects Versions: 2.5.0
            Reporter: Wangda Tan
            Assignee: Wangda Tan


With MAPREDUCE-5900, a preempted MR task should not be treated as failed.
However, an MR task can still fail and report the failure to the AM after preemption takes
effect but before the AM has received the completed container from the RM. In that case the
task attempt is marked failed instead of preempted.

For example, FileSystem registers a shutdown hook that closes all FileSystem instances. If a
FileSystem is in use at that moment (for instance, while reading split details from HDFS), the
MR task fails and reports a fatal error to the MR AM. The following exception is raised:
{code}
2014-07-22 01:46:19,613 FATAL [IPC Server handler 10 on 56903] org.apache.hadoop.mapred.TaskAttemptListenerImpl:
Task: attempt_1405985051088_0018_m_000025_0 - exited : java.io.IOException: Filesystem closed
	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:776)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:645)
	at java.io.DataInputStream.readByte(DataInputStream.java:265)
	at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
	at org.apache.hadoop.io.WritableUtils.readVIntInRange(WritableUtils.java:348)
	at org.apache.hadoop.io.Text.readString(Text.java:464)
	at org.apache.hadoop.io.Text.readString(Text.java:457)
	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:357)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{code}

We should prevent this. Other exceptions can also occur while the process is shutting down,
and none of them should be reported to the AM.
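
One possible shape for such a guard is sketched below. It is not the actual patch. The class name ShutdownAwareReporter, its methods, and the Consumer<String> standing in for the task-to-AM channel are all hypothetical and use only the JDK. In Hadoop itself the shutdown state would more likely come from its own shutdown-hook machinery than from a hand-rolled flag.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical helper: drops error reports once JVM shutdown has begun.
final class ShutdownAwareReporter {
    // Flipped to true when shutdown starts (or when a test marks it by hand).
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    // Registers a JVM shutdown hook that sets the flag.
    void installHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::markShuttingDown));
    }

    void markShuttingDown() {
        shuttingDown.set(true);
    }

    boolean isShuttingDown() {
        return shuttingDown.get();
    }

    // Forwards the error to the AM channel (an assumption: modelled here as a
    // Consumer<String>) unless the process is shutting down.
    // Returns true if the error was reported, false if it was suppressed.
    boolean report(Throwable error, Consumer<String> amChannel) {
        if (shuttingDown.get()) {
            return false; // likely a side effect of shutdown, e.g. "Filesystem closed"
        }
        amChannel.accept(error.toString());
        return true;
    }

    public static void main(String[] args) {
        List<String> sentToAm = new ArrayList<>();
        ShutdownAwareReporter reporter = new ShutdownAwareReporter();
        reporter.installHook();

        // Normal operation: the failure is forwarded.
        reporter.report(new IOException("disk error"), sentToAm::add);

        // Simulate the start of shutdown, then hit the error from this issue.
        reporter.markShuttingDown();
        reporter.report(new IOException("Filesystem closed"), sentToAm::add);

        System.out.println("Reported to AM: " + sentToAm);
    }
}
```

The JVM starts shutdown hooks concurrently and in no guaranteed order, so a flag set from a hook can still lose the race against the FileSystem hook. A real fix would need to query a shutdown indicator that is set before any hook runs.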



--
This message was sent by Atlassian JIRA
(v6.2#6252)
