hadoop-common-dev mailing list archives

From stack <st...@archive.org>
Subject Re: Hung job
Date Mon, 13 Mar 2006 23:43:51 GMT
Doug Cutting wrote:
> stack wrote:
>> ...
>>
>> Somehow the reduce needs to give up and the jobtracker needs to rerun 
>> the map just as it would if the tasktracker had died completely.
> 
> Perhaps what should happen is that the TaskTracker should exit when it 
> encounters errors reading map output.....
> 
> I've attached a patch.  The TaskTracker will restart, but with a new id, 
> so all of its tasks will be considered lost.  This will unfortunately 
> lose other map tasks done by this tasktracker, but at least things will 
> keep going.
> 
> Does this look right to you?
> 

Yes. Sounds like the right thing to do. Minor comments inline below. 
In the meantime, let me try it.
Thanks,
St.Ack


> Doug
> 
> 
...

>  
>          return 0;
> Index: src/java/org/apache/hadoop/mapred/MapOutputFile.java
> ===================================================================
> --- src/java/org/apache/hadoop/mapred/MapOutputFile.java	(revision 385629)
> +++ src/java/org/apache/hadoop/mapred/MapOutputFile.java	(working copy)
> @@ -17,6 +17,7 @@
>  package org.apache.hadoop.mapred;
>  
>  import java.io.IOException;
> +import java.util.logging.Level;
>  
>  import java.io.*;
>  import org.apache.hadoop.io.*;
> @@ -108,12 +109,26 @@
>      // write the length-prefixed file content to the wire
>      File file = getOutputFile(mapTaskId, partition);
>      out.writeLong(file.length());
> -    FSDataInputStream in = FileSystem.getNamed("local", this.jobConf).open(file);
> +
> +    FSDataInputStream in = null;
>      try {
> +      in = FileSystem.getNamed("local", this.jobConf).open(file);
> +    } catch (IOException e) {
> +      // log a SEVERE exception in order to cause TaskTracker to exit
> +      TaskTracker.LOG.log(Level.SEVERE, "Can't open map output:" + file, e);
> +

Should there be a 'throw e;' after TaskTracker.LOG.log above?



> +    }
> +    try {
>        byte[] buffer = new byte[8192];
> -      int l;
> -      while ((l = in.read(buffer)) != -1) {
> +      int l  = 0;
> +      
> +      while (l != -1) {
>          out.write(buffer, 0, l);
> +        try {
> +          l = in.read(buffer);
> +        } catch (IOException e) {
> +          // log a SEVERE exception in order to cause TaskTracker to exit
> +          TaskTracker.LOG.log(Level.SEVERE,"Can't read map output:" + file, e);


And same here.


> +        }
>        }
>      } finally {
>        in.close();
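
To make the two inline comments concrete, here is a minimal, self-contained sketch of the log-then-rethrow shape they suggest. It is not the patch itself: the class name MapOutputCopySketch, the local LOG (a stand-in for TaskTracker.LOG) and the use of a plain FileInputStream in place of FileSystem.getNamed("local", ...).open(file) are all assumptions made so the example runs without Hadoop. As quoted, without a rethrow the first catch would fall through with 'in' still null, and the second would leave 'l' unchanged so the loop would write the same buffer again and retry the read.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

class MapOutputCopySketch {
  // Hypothetical stand-in for TaskTracker.LOG.
  static final Logger LOG = Logger.getLogger("MapOutputCopySketch");

  /** Writes the length-prefixed content of file to out; logs SEVERE and rethrows on I/O errors. */
  static void writeFile(File file, DataOutputStream out) throws IOException {
    out.writeLong(file.length());

    InputStream in;
    try {
      in = new FileInputStream(file); // assumption: replaces the Hadoop local FileSystem open
    } catch (IOException e) {
      LOG.log(Level.SEVERE, "Can't open map output:" + file, e);
      throw e; // do not continue with an unopened stream
    }

    try {
      byte[] buffer = new byte[8192];
      while (true) {
        int l;
        try {
          l = in.read(buffer);
        } catch (IOException e) {
          LOG.log(Level.SEVERE, "Can't read map output:" + file, e);
          throw e; // stop copying instead of retrying the failed read
        }
        if (l == -1) {
          break; // end of file
        }
        out.write(buffer, 0, l);
      }
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    // Copy a three-byte temporary file and report how many bytes reached the sink.
    File tmp = File.createTempFile("map-output", ".bin");
    tmp.deleteOnExit();
    try (FileOutputStream fos = new FileOutputStream(tmp)) {
      fos.write(new byte[] {1, 2, 3});
    }
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    writeFile(tmp, new DataOutputStream(sink));
    System.out.println("bytes written: " + sink.size());
  }
}
```

Because both catch blocks end in 'throw e;', the compiler can also prove that 'in' is assigned before the copy loop, so the '= null' initialiser is no longer needed.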

