hadoop-common-user mailing list archives

From Yuri Pradkin <y...@isi.edu>
Subject IsolationRunner [was Re: extracting input to a task from a (streaming) job?]
Date Wed, 27 Aug 2008 15:08:19 GMT
I posted this a while back and have been wondering whether I missed something 
and the doc is out of date or this is a bug and I should file a jira.  Is 
there anyone out there who is successfully using IsolationRunner?  Please let 
me know.

Thanks,

  -Yuri

On Friday 08 August 2008 10:09:48 Yuri Pradkin wrote:

> > >I believe you should set "keep.failed.tasks.files" to true -- this way,
> > > given a task id, you can see what input files it has in ~/
> > >taskTracker/${taskid}/work (source:
> > >http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner )

I forgot to add: I set
  <property>
    <name>keep.failed.task.files</name>
    <value>true</value>
  </property>

Note that the doc calls it keep.failed.tasks.files ("tasks", plural), which 
doesn't match the name used in the code.
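
To illustrate why the spelling matters, here is a minimal sketch. It uses plain java.util.Properties as a stand-in for Hadoop's configuration object, and the default of false is an assumption, as are the class and method names. It only shows that a value stored under the tutorial's plural key is invisible to a lookup on the singular key.

```java
import java.util.Properties;

public class KeepFailedKey {
    // Key that the code reads, per the message above (singular "task").
    static final String CODE_KEY = "keep.failed.task.files";
    // Key as spelled in the tutorial (plural "tasks").
    static final String DOC_KEY = "keep.failed.tasks.files";

    // Hypothetical stand-in for a boolean config lookup; default false is assumed.
    static boolean keepFailedTaskFiles(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(CODE_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties fromDoc = new Properties();
        fromDoc.setProperty(DOC_KEY, "true");   // spelling taken from the tutorial

        Properties fromCode = new Properties();
        fromCode.setProperty(CODE_KEY, "true"); // spelling taken from the code

        System.out.println("tutorial spelling seen by lookup: " + keepFailedTaskFiles(fromDoc));
        System.out.println("code spelling seen by lookup:     " + keepFailedTaskFiles(fromCode));
    }
}
```

With the plural key the lookup falls back to its default, so the setting is silently ignored rather than rejected.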

>
> IsolationRunner does not work as described in the tutorial.  After the task
> hung, I failed it via the web interface.  Then I went to the node that was
> running this task
>
>   $ cd ...local/taskTracker/jobcache/job_200808071645_0001/work
> (this path is already different from the tutorial's)
>
>   $ hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml
> Exception in thread "main" java.lang.NullPointerException
>         at
> org.apache.hadoop.mapred.IsolationRunner.main(IsolationRunner.java:164)
>
> Looking at IsolationRunner code, I see this:
>
>     164     File workDirName = new File(lDirAlloc.getLocalPathToRead(
>     165                                   TaskTracker.getJobCacheSubdir()
>     166                                   + Path.SEPARATOR + taskId.getJobID()
>     167                                   + Path.SEPARATOR + taskId
>     168                                   + Path.SEPARATOR + "work",
>     169                                   conf). toString());
>
> I.e. it assumes there is supposed to be a taskID subdirectory under the job
> dir, but:
>  $ pwd
>  ...mapred/local/taskTracker/jobcache/job_200808071645_0001
>  $ ls
>  jars  job.xml  work
>
> -- it's not there.  Any suggestions?
>
> Thanks,
>
>   -Yuri
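
The layout mismatch described in the quoted message can be sketched with plain java.nio paths. This is only an illustration, not Hadoop code: the class and method names are made up, "taskTracker/jobcache" is assumed to be what getJobCacheSubdir() returns (it matches the directory names in the listing), and the task id is hypothetical because the message does not give one.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkDirLayout {
    // Relative path the quoted IsolationRunner lines build:
    //   <jobCacheSubdir>/<jobId>/<taskId>/work
    static Path expectedByIsolationRunner(String jobCacheSubdir, String jobId, String taskId) {
        return Paths.get(jobCacheSubdir, jobId, taskId, "work");
    }

    // Relative path actually present on the node, per the directory listing:
    //   <jobCacheSubdir>/<jobId>/work
    static Path observedOnDisk(String jobCacheSubdir, String jobId) {
        return Paths.get(jobCacheSubdir, jobId, "work");
    }

    public static void main(String[] args) {
        String jobCache = "taskTracker/jobcache";             // assumed subdir value
        String jobId = "job_200808071645_0001";               // from the message
        String taskId = "task_200808071645_0001_m_000000_0";  // hypothetical task id

        System.out.println("expected: " + expectedByIsolationRunner(jobCache, jobId, taskId));
        System.out.println("observed: " + observedOnDisk(jobCache, jobId));
    }
}
```

The two paths differ by exactly the task-id component, which is the subdirectory missing from the listing.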


