hadoop-common-user mailing list archives

From Yuri Pradkin <y...@isi.edu>
Subject Re: extracting input to a task from a (streaming) job?
Date Fri, 08 Aug 2008 17:09:48 GMT
On Thursday 07 August 2008 16:43:10 John Heidemann wrote:
> On Thu, 07 Aug 2008 19:42:05 +0200, "Leon Mergen" wrote:
> >Hello John,
> >
> >On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann <johnh@isi.edu> wrote:
> >> I have a large Hadoop streaming job that generally works fine,
> >> but a few (2-4) of the ~3000 maps and reduces have problems.
> >> To make matters worse, the problems are system-dependent (we run a
> >> cluster with machines of slightly different OS versions).
> >> I'd of course like to debug these problems, but they are embedded in a
> >> large job.
> >>
> >> Is there a way to extract the input given to a reducer from a job, given
> >> the task identity?  (This would also be helpful for mappers.)
> >
> >I believe you should set "keep.failed.tasks.files" to true -- this way,
> >given a task id, you can see what input files it has in ~/
> >taskTracker/${taskid}/work (source:
> >http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner )

IsolationRunner does not work as described in the tutorial.  After the task
hung, I failed it via the web interface.  Then I went to the node that was
running this task:

  $ cd ...local/taskTracker/jobcache/job_200808071645_0001/work
(this path is already different from the tutorial's)

  $ hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml
Exception in thread "main" java.lang.NullPointerException
        at org.apache.hadoop.mapred.IsolationRunner.main(IsolationRunner.java:164)

Looking at IsolationRunner code, I see this:

    164     File workDirName = new File(lDirAlloc.getLocalPathToRead(
    165                                   TaskTracker.getJobCacheSubdir()
    166                                   + Path.SEPARATOR + taskId.getJobID()
    167                                   + Path.SEPARATOR + taskId
    168                                   + Path.SEPARATOR + "work",
    169                                   conf). toString());

I.e. it expects a task ID subdirectory under the job directory, but:
 $ pwd
 ...mapred/local/taskTracker/jobcache/job_200808071645_0001
 $ ls
 jars  job.xml  work

-- it's not there.  Any suggestions?
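
For anyone who wants to check the same thing on their own node, here is a
rough sketch of the lookup as I read it from the snippet above: job cache
directory, then job ID, then task ID, then "work".  The function names are
made up for illustration, and the job cache directory and task ID are
placeholders you have to supply yourself.

```shell
# Sketch only: these helpers are hypothetical, not part of Hadoop.
# They mirror the path concatenation in the IsolationRunner snippet
# (<jobcache dir>/<job id>/<task id>/work) as I understand it.

# Print the work directory IsolationRunner appears to look for.
expected_work_dir() {
    jobcache_dir=$1   # the local .../taskTracker/jobcache directory
    job_id=$2         # e.g. job_200808071645_0001
    task_id=$3        # ID of the task being debugged
    printf '%s/%s/%s/work\n' "$jobcache_dir" "$job_id" "$task_id"
}

# Report whether that directory exists; returns non-zero if it is missing.
check_work_dir() {
    dir=$(expected_work_dir "$1" "$2" "$3")
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}
```

Run it as `check_work_dir <jobcache dir> job_200808071645_0001 <task id>`;
with the layout shown above it should report the directory as missing.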

Thanks,

  -Yuri


