hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "HowToDebugMapReducePrograms" by Amareshwari
Date Fri, 28 Sep 2007 06:43:47 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by Amareshwari:

  This can be extremely useful to display debug information about the current record being
handled, or to set debug flags about the status of the mapper. While running locally
on a small data set can expose many bugs, large data sets may contain pathological cases that
are otherwise unexpected. This method of debugging can help catch those cases.
+ == Run a debug script when a task fails ==
+ A facility is provided, via user-provided scripts, for post-processing task logs and the
task's stdout, stderr, and core file. There is a default script which processes core dumps
under gdb and prints the stack trace. The last five lines of the debug script's stdout and
stderr are printed in the task diagnostics. These outputs are also displayed in the job UI on demand.
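As a rough illustration of what such a post-processing script might do (the function name, argument order, and gdb usage here are a hypothetical sketch, not part of Hadoop):

```shell
# Hypothetical helper mirroring a task-debug script: it receives the
# paths that the framework substitutes for @stdout@, @stderr@ and @core@.
process_task_logs() {
    stdout_file=$1
    stderr_file=$2
    core_file=$3
    # Like the default script, dump a stack trace under gdb when a
    # core file exists (assumes gdb is on the PATH; skipped otherwise).
    if [ -f "$core_file" ]; then
        gdb -q -batch -ex bt /proc/self/exe "$core_file" 2>/dev/null
    fi
    # Only the last five lines of the debug script's stdout and stderr
    # reach the task diagnostics, so keep the output short.
    tail -n 5 "$stdout_file"
    tail -n 5 "$stderr_file" >&2
}
```

A script along these lines would be invoked once per failed task, with the task's log paths as arguments.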
+ == How to submit a debug command ==
+ A quick and easy way to set a debug command is to set the properties mapred.map.task.debug.command
and mapred.reduce.task.debug.command, for debugging the map task and the reduce task respectively.
+ These properties can also be set with the APIs conf.setMapDebugCommand(String cmd) and conf.setReduceDebugCommand(String cmd).
+ The command can consist of @stdout@, @stderr@ and @core@ to access the task's stdout, stderr and
core files respectively.
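For example, the two properties above could be set in the job configuration file; this is a sketch, and myDebugScript is a placeholder name for a user-provided script, not a real file:

```xml
<!-- Sketch of a job configuration fragment; myDebugScript is a
     placeholder for a user-provided script available on the task node. -->
<property>
  <name>mapred.map.task.debug.command</name>
  <value>myDebugScript @stdout@ @stderr@ @core@</value>
</property>
<property>
  <name>mapred.reduce.task.debug.command</name>
  <value>myDebugScript @stdout@ @stderr@ @core@</value>
</property>
```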
+ == How to submit a debug script ==
  = How to debug Hadoop Pipes programs =
  In order to debug Pipes programs you need to keep the downloaded commands. 
