hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "HowToDebugMapReducePrograms" by Amareshwari
Date Fri, 28 Sep 2007 07:15:36 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by Amareshwari:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

------------------------------------------------------------------------------
  
  == Run a debug script when Task fails ==
  
- A facility is provided, via user-provided scripts, for doing post-processing on task logs,
task's stdout, stderr, core file.There is a default script which processes core dumps under
gdb and prints stack trace. The last five lines from stdout and stderr of debug script are
printed on the diagnostics. These outputs are displayed job UI on demand. 
+ A facility is provided, via user-provided scripts, for doing post-processing on task logs and a
task's stdout, stderr, and core file. There is a default script which processes core dumps under
gdb and prints the stack trace. The last five lines from the stdout and stderr of the debug script
are printed on the diagnostics. These outputs are displayed on the job UI on demand. 
  
  == How to submit debug command ==
  
- A very quick and easy way to set debug command is to set the properties mapred.map.task.debug.command
and mapred.reduce.task.debug.command for debugging map task and reduce task respectively.
+ A quick way to set a debug command is to set the properties "mapred.map.task.debug.command"
and "mapred.reduce.task.debug.command" for debugging map tasks and reduce tasks respectively.
  These properties can also be set by the APIs conf.setMapDebugCommand(String cmd) and conf.setReduceDebugCommand(String
cmd).
- The command can consists of @stdout@, @stderr@, @core@ to access task's stdout, stderr and
core files respectively.
+ The debug command can consist of @stdout@, @stderr@, @core@ to access the task's stdout, stderr
and core files respectively.
+ In case of Streaming, the debug command can be submitted with the command-line options -mapdebug
and -reducedebug for debugging the mapper and reducer respectively.
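As a sketch, the two properties above could be set in the job's configuration XML; the script name and the use of @stderr@ here are illustrative, not prescribed by the page:

```xml
<!-- Illustrative job configuration fragment; debug-script.sh is an assumed name. -->
<property>
  <name>mapred.map.task.debug.command</name>
  <value>./debug-script.sh @stderr@</value>
</property>
<property>
  <name>mapred.reduce.task.debug.command</name>
  <value>./debug-script.sh @stderr@</value>
</property>
```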
+ 
+ For example, the command can be 'myScript @stderr@'. Here myScript is the executable, and
it processes the failed task's stderr.
+ 
+ The debug command can also be a gdb command, where the user submits a command file to execute
using -x. 
+ The debug command then looks like 'gdb <program-name> -c @core@ -x <cmd-file>
'. This command loads the core file of the failed task of <program-name> and executes the commands
in <cmd-file>
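For instance, a <cmd-file> for the gdb form above might contain batch commands like the following (a sketch; any gdb commands work here):

```gdb
# Hypothetical gdb command file (<cmd-file>): show where the task crashed.
info threads
backtrace
quit
```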
  
  == How to submit debug script ==
  
+ To submit the debug script file, first put the file in dfs.
+ 
Set the property "mapred.cache.executables" with the value <path>#<executable-name>.
The executable property can also be set by the APIs DistributedCache.addCacheExecutable(URI,conf)
and DistributedCache.setCacheExecutables(URI[],conf), where the URI is of the form "hdfs://host:port/<path>#<executable-name>".
For Streaming, the executable can be added through the -cacheExecutable URI option.
+ 
+ For gdb, the command file need not be executable, but it does need to be in dfs. It can
be added to the cache by setting the property "mapred.cache.files" with the value <path>#<cmd-file>
or through the API DistributedCache.addCacheFile(URI,conf).
+ Please make sure the property "mapred.create.symlink" is set to "yes".
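A debug script like 'myScript @stderr@' above can be an ordinary shell script; the framework substitutes @stderr@ with the path to the failed task's stderr file. A minimal sketch (the function name and file path are illustrative) that echoes the tail of that file, which then appears in the diagnostics:

```shell
#!/bin/sh
# Hypothetical debug script: invoked by the framework as 'myScript @stderr@',
# so $1 is the path to the failed task's stderr file.
summarize_stderr() {
  echo "=== Last 5 lines of task stderr ==="
  tail -5 "$1"
}

# Demo with a throwaway file standing in for @stderr@:
printf 'l1\nl2\nl3\nl4\nl5\nl6\n' > /tmp/fake_stderr.txt
summarize_stderr /tmp/fake_stderr.txt
```

Such a script would then be put in dfs and registered through "mapred.cache.executables" as described above.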
  
  = How to debug Hadoop Pipes programs =
  
