hadoop-mapreduce-dev mailing list archives

From "Harsh J Chouraria (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
Date Wed, 29 Dec 2010 19:39:56 GMT
No task may execute due to an Integer overflow possibility
----------------------------------------------------------

                 Key: MAPREDUCE-2236
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.2
         Environment: Linux, Hadoop 0.20.2
            Reporter: Harsh J Chouraria
            Assignee: Harsh J Chouraria
            Priority: Critical
             Fix For: 0.23.0


If the maximum number of task attempts is configured as Integer.MAX_VALUE, an integer overflow
occurs inside TaskInProgress. As a result, the cluster never attempts any task and the map tasks
stay in the pending state forever.

For example, here's a job driver that causes this:
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NullOutputFormat;


@SuppressWarnings("deprecation")
public class IntegerOverflow {

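	/**
	 * Submits a map-only job whose maximum map attempts is Integer.MAX_VALUE,
	 * which triggers the overflow described in this issue.
	 *
	 * @param args unused
	 * @throws IOException if the input file cannot be created or the job fails
	 */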
	public static void main(String[] args) throws IOException {
		JobConf conf = new JobConf();
		
		Path inputPath = new Path("ignore");
		FileSystem fs = FileSystem.get(conf);
		if (!fs.exists(inputPath)) {
			FSDataOutputStream out = fs.create(inputPath);
			out.writeChars("Test");
			out.close();
		}
		
		conf.setInputFormat(TextInputFormat.class);
		conf.setOutputFormat(NullOutputFormat.class);
		FileInputFormat.addInputPath(conf, inputPath);
		
		conf.setMapperClass(IdentityMapper.class);
		conf.setNumMapTasks(1);
		// Problem inducing line follows.
		conf.setMaxMapAttempts(Integer.MAX_VALUE);
		
		// No reducer in this test, although setMaxReduceAttempts leads to the same problem.
		conf.setNumReduceTasks(0);
		
		JobClient.runJob(conf);
	}

}
{code}

The above code will not let any map task run. Additionally, a warning like the following is
written to the JobTracker log, and it clearly shows the overflow:
{code}
2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_000000'
{code}

The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line
1018 (trunk), part of the getTaskToRun(String taskTracker) method.
{code}
  public Task getTaskToRun(String taskTracker) throws IOException {   
    // Create the 'taskid'; do not count the 'killed' tasks against the job!
    TaskAttemptID taskid = null;
    /* ============ THIS LINE v ====================================== */
    if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) {
    /* ============ THIS LINE ^====================================== */
      // Make sure that the attempts are unique across restarts
      int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId;
      taskid = new TaskAttemptID( id, attemptId);
      ++nextTaskId;
    } else {
      LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) +
              " (plus " + numKilledTasks + " killed)"  + 
              " attempts for the tip '" + getTIPId() + "'");
      return null;
    }
{code}

Since all three operands of the addition are of type int, setting one of them to
Integer.MAX_VALUE makes the sum overflow and wrap around to a negative value. The condition
then evaluates to false, so the method logs the warning and returns null.
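
The wraparound can be reproduced in isolation. The following is a minimal standalone sketch,
not Hadoop code: the class name and the limit helper are hypothetical, and the value 1 for
MAX_TASK_EXECS is an assumption inferred from the logged limit of -2147483648
(Integer.MAX_VALUE + 1 wraps to Integer.MIN_VALUE).

```java
// Standalone illustration of the int overflow; not part of Hadoop.
class AttemptLimitOverflow {
    // Assumed value, inferred from the logged limit of -2147483648.
    static final int MAX_TASK_EXECS = 1;

    // Mirrors the int-only sum used in the condition inside getTaskToRun.
    static int limit(int maxTaskAttempts, int numKilledTasks) {
        return MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks;
    }

    public static void main(String[] args) {
        int nextTaskId = 0; // first attempt of the task
        int limit = limit(Integer.MAX_VALUE, 0);

        // The sum wraps around, so the limit is negative.
        System.out.println("limit = " + limit);
        // 0 < negative limit is false, so no attempt would ever be created.
        System.out.println("nextTaskId < limit: " + (nextTaskId < limit));
    }
}
```

With an ordinary setting such as 4 attempts, the same helper returns 5 and the comparison
succeeds for the first attempts, which is why the problem only appears near Integer.MAX_VALUE.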

One possible solution is to make one of these variables a long, so that the addition is
performed in 64-bit arithmetic and does not overflow.
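
A hedged sketch of the long-based idea follows. It is not the patch applied to Hadoop: the
class and method names are hypothetical, and MAX_TASK_EXECS is assumed to be 1. Casting a
single operand to long is enough to promote the whole sum to a 64-bit addition.

```java
// Standalone illustration of widening the sum to long; not part of Hadoop.
class AttemptLimitWidened {
    // Assumed value; the real constant lives in TaskInProgress.
    static final int MAX_TASK_EXECS = 1;

    // Casting the first operand to long makes every addition a long addition.
    static long limit(int maxTaskAttempts, int numKilledTasks) {
        return (long) MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks;
    }

    // Hypothetical helper mirroring the condition inside getTaskToRun.
    static boolean mayRun(int nextTaskId, int maxTaskAttempts, int numKilledTasks) {
        return nextTaskId < limit(maxTaskAttempts, numKilledTasks);
    }

    public static void main(String[] args) {
        // The limit is now 2147483648, which does not fit in an int but fits in a long.
        System.out.println("limit = " + limit(Integer.MAX_VALUE, 0));
        // The first attempt is allowed even with Integer.MAX_VALUE configured.
        System.out.println("may run: " + mayRun(0, Integer.MAX_VALUE, 0));
    }
}
```

Because three int values can sum to at most about 6.4 billion, a long has ample headroom here,
and the comparison against the int nextTaskId is carried out in long arithmetic as well.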


