hadoop-common-commits mailing list archives

From: Apache Wiki <wikidi...@apache.org>
Subject: [Lucene-hadoop Wiki] Update of "HowToDebugMapReducePrograms" by TedDunning
Date: Tue, 28 Aug 2007 05:58:43 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by TedDunning:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

The comment on the change is:
added help in setting config parameters.

------------------------------------------------------------------------------
  
   1. Start by getting everything running (likely on a small input) in the local runner. 
      You do this by setting your job tracker to "local" in your config. The local runner can run 
-     under the debugger and runs on your development machine.
+     under the debugger and runs on your development machine.  A very quick and easy way to set this 
+     config variable is to include the following line just before you run the job:
+ 
+     {{{conf.set("mapred.job.tracker", "local");}}}
+ 
+     You may also want to do this so that the input and output files live in the local file system rather than in the Hadoop 
+     distributed file system (HDFS):
+ 
+     {{{conf.set("fs.default.name", "local");}}}
+ 
+     You can also set these configuration parameters in {{{hadoop-site.xml}}}.  The configuration files 
+     {{{hadoop-default.xml}}}, {{{mapred-default.xml}}} and {{{hadoop-site.xml}}} should appear somewhere in your program's 
+     class path when the program runs.
+ 
  
   2. Run the small input on a 1 node cluster. This will smoke out all of the issues that happen with
      distribution and the "real" task runner, but you only have a single place to look at logs. Most 

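As a rough illustration of how the two settings described in the change above fit into a job driver, here is a minimal sketch using the old {{{org.apache.hadoop.mapred}}} API. {{{DebugDriver}}} is a hypothetical class name, the identity mapper and reducer stand in for your own job classes, and input/output path and format setup is omitted:

{{{
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class DebugDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(DebugDriver.class);

    // Run the job in-process (the local runner) so it can be stepped
    // through under a debugger on the development machine.
    conf.set("mapred.job.tracker", "local");

    // Read input from and write output to the local file system
    // rather than HDFS.
    conf.set("fs.default.name", "local");

    // Stand-ins for your real job classes; input/output paths, formats
    // and key/value classes are set as usual for your job (omitted here).
    conf.setMapperClass(IdentityMapper.class);
    conf.setReducerClass(IdentityReducer.class);

    JobClient.runJob(conf);
  }
}
}}}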