From: Apache Wiki
To: hadoop-commits@lucene.apache.org
Date: Fri, 26 Oct 2007 09:00:28 -0000
Message-ID: <20071026090028.11474.17169@eos.apache.org>
Subject: [Lucene-hadoop Wiki] Update of "HowToDebugMapReducePrograms" by Amareshwari

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by Amareshwari:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

------------------------------------------------------------------------------
  }}}

  and run your executable under the debugger or valgrind. It will run as if the framework were feeding it commands and data, and produce an output file downlink.data.out with the binary commands that it would have sent up to the framework. Eventually, I'll probably make the downlink.data.out file into a text-based format, but for now it is binary. Most problems, however, will be pretty clear in the debugger or valgrind, even without looking at the generated data.

- = The following sections are applicable only for Hadoop 0.15.0 and above =
+ = The following sections are applicable only for Hadoop 0.16.0 and above =

  = Run a debug script when Task fails =

@@ -83, +83 @@

  == How to submit debug script ==

- A quick way to set debug script is to set the properties "mapred.map.task.debug.script" and "mapred.reduce.task.debug.script" for debugging map task and reduce task respectively. These properties can also be set by APIs conf.setMapDebugScript(String script) and conf.setReduceDebugScript(String script).
+ A quick way to set a debug script is to set the properties "mapred.map.task.debug.script" and "mapred.reduce.task.debug.script" for debugging the map task and the reduce task respectively. These properties can also be set through the APIs JobConf.setMapDebugScript and JobConf.setReduceDebugScript.
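+ 
+ For example, a minimal sketch (the script name "./myscript" and the driver class MyJob are placeholders, not part of the API):
+ {{{
+ JobConf conf = new JobConf(MyJob.class);
+ conf.set("mapred.map.task.debug.script", "./myscript");  // raw property for map tasks
+ conf.setMapDebugScript("./myscript");                    // equivalent typed API
+ conf.setReduceDebugScript("./myscript");                 // typed API for reduce tasks
+ }}}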
- The debug command is run as $script $stdout $stderr $syslog $jobconf. Task's stdout, stderr, syslog and jobconf files can be accessed inside the script as $1, $2, $3 and $4. In case of streaming, debug script can be submitted with command-line options -mapdebug, -reducedebug for debugging mapper and redcuer respectively.
- To submit the debug script file, first put the file in dfs.
- Make sure the property "mapred.create.symlink" is set to "yes". This can also be set by [http://lucene.apache.org/hadoop/api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration) DistributedCache.createSymlink]
+ The script is given the task's stdout, stderr, syslog, and jobconf files as arguments.
+ The debug command, run on the node where the map/reduce task failed, is:
+ {{{ $script $stdout $stderr $syslog $jobconf }}}
+ 
+ For Streaming, the debug script can be submitted with the command-line options -mapdebug and -reducedebug for debugging the mapper and the reducer respectively.
+ 
+ Pipes programs have the C++ program name as a fifth argument. Thus, for Pipes programs the command is
+ 
+ {{{ $script $stdout $stderr $syslog $jobconf $program }}}
+ 
+ To submit the debug script file, first put the file in DFS.
+ 
- The file can be added by setting the property "mapred.cache.files" with value #. For more than one file, they can be added as comma seperated paths.
+ The file can be distributed by setting the property "mapred.cache.files" with a value of the form "<path>#<script-name>". More than one file can be added as comma-separated paths.
+ The script file needs to be symlinked.
+ This property can also be set by the APIs [http://lucene.apache.org/hadoop/api/org/apache/hadoop/filecache/DistributedCache.html#addCacheFile(java.net.URI,%20org.apache.hadoop.conf.Configuration) DistributedCache.addCacheFile(URI,conf)] and [http://lucene.apache.org/hadoop/api/org/apache/hadoop/filecache/DistributedCache.html#setCacheFiles DistributedCache.setCacheFiles(URIs,conf)], where the URI is of the form "hdfs://host:port/<path>#<script-name>". For Streaming, the file can be added through the command-line option -cacheFile.
+ To create the symlink for the file, set the property "mapred.create.symlink" to "yes". This can also be set by [http://lucene.apache.org/hadoop/api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration) DistributedCache.createSymlink]
+ 
+ Here is an example of how to submit a script:
+ {{{
+ jobConf.setMapDebugScript("./myscript");
+ DistributedCache.createSymlink(jobConf);
+ DistributedCache.addCacheFile(new URI("/debug/scripts/myscript#myscript"), jobConf);
+ }}}
+ A fuller, self-contained version of this example is sketched at the end of the page.

  == Default Behavior ==

@@ -101, +121 @@

  For Pipes: Stdout, stderr are shown on the job UI.
- Default gdb script is run which prints info abt threads: thread Id and function in which it was running when task failed.
+ If the failed task has a core file, a default gdb script is run which prints info about the threads: each thread's id and the function it was running when the task failed.
- And prints stack tarce where task has failed.
+ It also prints the stack trace where the task failed.
  For Streaming: Stdout, stderr are shown on the Job UI.
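+ 
+ Putting together the submission steps from the previous section, a complete driver might look like the following. This is only a sketch: the class names and the DFS path /debug/scripts/myscript are placeholders carried over from the example above.
+ {{{
+ import java.net.URI;
+ 
+ import org.apache.hadoop.filecache.DistributedCache;
+ import org.apache.hadoop.mapred.JobClient;
+ import org.apache.hadoop.mapred.JobConf;
+ 
+ public class DebugScriptJob {
+   public static void main(String[] args) throws Exception {
+     JobConf jobConf = new JobConf(DebugScriptJob.class);
+     // ... usual job setup: mapper, reducer, input and output paths ...
+ 
+     // Run ./myscript on the failing task's node when a map or reduce task fails.
+     jobConf.setMapDebugScript("./myscript");
+     jobConf.setReduceDebugScript("./myscript");
+ 
+     // Distribute the script (already copied to DFS) and symlink it into the
+     // task's working directory under the name "myscript".
+     DistributedCache.createSymlink(jobConf);
+     DistributedCache.addCacheFile(new URI("/debug/scripts/myscript#myscript"), jobConf);
+ 
+     JobClient.runJob(jobConf);
+   }
+ }
+ }}}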