hadoop-common-commits mailing list archives

From: d...@apache.org
Subject: svn commit: r643793 [1/3] - in /hadoop/core/trunk: CHANGES.txt docs/changes.html docs/mapred_tutorial.html docs/mapred_tutorial.pdf src/docs/src/documentation/content/xdocs/mapred_tutorial.xml src/docs/src/documentation/content/xdocs/site.xml
Date: Wed, 02 Apr 2008 08:42:49 GMT
Author: ddas
Date: Wed Apr  2 01:42:43 2008
New Revision: 643793

URL: http://svn.apache.org/viewvc?rev=643793&view=rev
Log:
HADOOP-3106. Adds documentation in forrest for debugging. Contributed by Amareshwari Sriramadasu.

Modified:
    hadoop/core/trunk/CHANGES.txt
    hadoop/core/trunk/docs/changes.html
    hadoop/core/trunk/docs/mapred_tutorial.html
    hadoop/core/trunk/docs/mapred_tutorial.pdf
    hadoop/core/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
    hadoop/core/trunk/src/docs/src/documentation/content/xdocs/site.xml

Modified: hadoop/core/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/CHANGES.txt?rev=643793&r1=643792&r2=643793&view=diff
==============================================================================
--- hadoop/core/trunk/CHANGES.txt (original)
+++ hadoop/core/trunk/CHANGES.txt Wed Apr  2 01:42:43 2008
@@ -176,6 +176,9 @@
     HADOOP-3093. Adds Configuration.getStrings(name, default-value) and
     the corresponding setStrings. (Amareshwari Sriramadasu via ddas)
 
+    HADOOP-3106. Adds documentation in forrest for debugging.
+    (Amareshwari Sriramadasu via ddas)
+
   OPTIMIZATIONS
 
     HADOOP-2790.  Fixed inefficient method hasSpeculativeTask by removing

Modified: hadoop/core/trunk/docs/changes.html
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/docs/changes.html?rev=643793&r1=643792&r2=643793&view=diff
==============================================================================
--- hadoop/core/trunk/docs/changes.html (original)
+++ hadoop/core/trunk/docs/changes.html Wed Apr  2 01:42:43 2008
@@ -36,7 +36,7 @@
     function collapse() {
       for (var i = 0; i < document.getElementsByTagName("ul").length; i++) {
         var list = document.getElementsByTagName("ul")[i];
-        if (list.id != 'trunk_(unreleased_changes)_' && list.id != 'release_0.16.2_-_unreleased_') {
+        if (list.id != 'trunk_(unreleased_changes)_' && list.id != 'release_0.16.2_-_2008-04-02_') {
           list.style.display = "none";
         }
       }
@@ -56,7 +56,7 @@
 </a></h2>
 <ul id="trunk_(unreleased_changes)_">
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._incompatible_changes_')">
 INCOMPATIBLE CHANGES
-</a>&nbsp;&nbsp;&nbsp;(11)
+</a>&nbsp;&nbsp;&nbsp;(19)
     <ol id="trunk_(unreleased_changes)_._incompatible_changes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2786">HADOOP-2786</a>.
 Move hbase out of hadoop core
 </li>
@@ -77,13 +77,28 @@
 and isDir(String) from ClientProtocol. ClientProtocol version changed
 from 26 to 27. (Tsz Wo (Nicholas), SZE via cdouglas)
 </li>
-      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2822">HADOOP-2822</a>.
Remove depreceted code for classes InputFormatBase and
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2822">HADOOP-2822</a>.
Remove deprecated code for classes InputFormatBase and
 PhasedFileSystem.<br />(Amareshwari Sriramadasu via enis)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2116">HADOOP-2116</a>.
Changes the layout of the task execution directory.<br />(Amareshwari Sriramadasu via
ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2828">HADOOP-2828</a>.
The following deprecated methods in Configuration.java
+have been removed
+    getObject(String name)
+    setObject(String name, Object value)
+    get(String name, Object defaultValue)
+    set(String name, Object value)
+    Iterator entries()<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2824">HADOOP-2824</a>.
Removes one deprecated constructor from MiniMRCluster.<br />(Amareshwari Sriramadasu
via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2823">HADOOP-2823</a>.
Removes deprecated methods getColumn(), getLine() from
+org.apache.hadoop.record.compiler.generated.SimpleCharStream.<br />(Amareshwari Sriramadasu
via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3060">HADOOP-3060</a>.
Removes one unused constructor argument from MiniMRCluster.<br />(Amareshwari Sriramadasu
via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2854">HADOOP-2854</a>.
Remove deprecated o.a.h.ipc.Server::getUserInfo().<br />(lohit vijayarenu via cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2563">HADOOP-2563</a>.
Remove deprecated FileSystem::listPaths.<br />(lohit vijayarenu via cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2818">HADOOP-2818</a>.
 Remove deprecated methods in Counters.<br />(Amareshwari Sriramadasu via tomwhite)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2831">HADOOP-2831</a>.
Remove deprecated o.a.h.dfs.INode::getAbsoluteName()<br />(lohit vijayarenu via cdouglas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._new_features_')">
 NEW FEATURES
-</a>&nbsp;&nbsp;&nbsp;(7)
+</a>&nbsp;&nbsp;&nbsp;(9)
     <ol id="trunk_(unreleased_changes)_._new_features_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-1398">HADOOP-1398</a>.
 Add HBase in-memory block cache.<br />(tomwhite)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2178">HADOOP-2178</a>.
 Job History on DFS.<br />(Amareshwari Sri Ramadasu via ddas)</li>
@@ -99,10 +114,12 @@
 DFSClient and DataNode sockets have 10min write timeout.<br />(rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2951">HADOOP-2951</a>.
 Add a contrib module that provides a utility to
 build or update Lucene indexes using Map/Reduce.<br />(Ning Li via cutting)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-1622">HADOOP-1622</a>.
 Allow multiple jar files for map reduce.<br />(Mahadev Konar via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2055">HADOOP-2055</a>.
Allows users to set PathFilter on the FileInputFormat.<br />(Alejandro Abdelnur via
ddas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._improvements_')">
 IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(22)
+</a>&nbsp;&nbsp;&nbsp;(26)
     <ol id="trunk_(unreleased_changes)_._improvements_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2655">HADOOP-2655</a>.
Copy on write for data and metadata files in the
 presence of snapshots. Needed for supporting appends to HDFS
@@ -114,9 +131,6 @@
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2895">HADOOP-2895</a>.
Let the profiling string be configurable.<br />(Martin Traverso via cdouglas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-910">HADOOP-910</a>.
Enables Reduces to do merges for the on-disk map output files
 in parallel with their copying.<br />(Amar Kamat via ddas)</li>
-      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2833">HADOOP-2833</a>.
Do not use "Dr. Who" as the default user in JobClient.
-A valid user name is required. (Tsz Wo (Nicholas), SZE via rangadi)
-</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-730">HADOOP-730</a>.
Use rename rather than copy for local renames.<br />(cdouglas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2810">HADOOP-2810</a>.
Updated the Hadoop Core logo.<br />(nigel)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2057">HADOOP-2057</a>.
 Streaming should optionally treat a non-zero exit status
@@ -138,18 +152,24 @@
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2939">HADOOP-2939</a>.
Make the automated patch testing process an executable
 Ant target, test-patch.<br />(nigel)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2239">HADOOP-2239</a>.
Add HsftpFileSystem to permit transferring files over ssl.<br />(cdouglas)</li>
-      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2910">HADOOP-2910</a>.
Throttle IPC Client/Server during bursts of
-requests or server slowdown.<br />(Hairong Kuang via dhruba)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2848">HADOOP-2848</a>.
[HOD]hod -o list and deallocate works even after deleting
 the cluster directory.<br />(Hemanth Yamijala via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2899">HADOOP-2899</a>.
[HOD] Cleans up hdfs:///mapredsystem directory after
 deallocation.<br />(Hemanth Yamijala via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2886">HADOOP-2886</a>.
 Track individual RPC metrics.<br />(girish vaitheeswaran via dhruba)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2373">HADOOP-2373</a>.
Improvement in safe-mode reporting.<br />(shv)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2796">HADOOP-2796</a>.
Enables distinguishing exit codes from user code vis-a-vis
+HOD's exit code.<br />(Hemanth Yamijala via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3091">HADOOP-3091</a>.
Modify FsShell command -put to accept multiple sources.<br />(Lohit Vijaya Renu via
cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3092">HADOOP-3092</a>.
Show counter values from job -status command.<br />(Tom White via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-1228">HADOOP-1228</a>.
 Ant task to generate Eclipse project files.<br />(tomwhite)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3093">HADOOP-3093</a>.
Adds Configuration.getStrings(name, default-value) and
+the corresponding setStrings.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3106">HADOOP-3106</a>.
Adds documentation in forrest for debugging.<br />(Amareshwari Sriramadasu via ddas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._optimizations_')">
 OPTIMIZATIONS
-</a>&nbsp;&nbsp;&nbsp;(7)
+</a>&nbsp;&nbsp;&nbsp;(10)
     <ol id="trunk_(unreleased_changes)_._optimizations_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2790">HADOOP-2790</a>.
 Fixed inefficient method hasSpeculativeTask by removing
 repetitive calls to get the current time and late checking to see if
@@ -168,10 +188,20 @@
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2148">HADOOP-2148</a>.
Eliminate redundant data-node blockMap lookups.<br />(shv)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2027">HADOOP-2027</a>.
Return the number of bytes in each block in a file
 via a single rpc to the namenode to speed up job planning.<br />(Lohit Vijaya Renu
via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2902">HADOOP-2902</a>.
 Replace uses of "fs.default.name" with calls to the
+accessor methods added in <a href="http://issues.apache.org/jira/browse/HADOOP-1967">HADOOP-1967</a>.<br
/>(cutting)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2119">HADOOP-2119</a>.
 Optimize scheduling of jobs with large numbers of
+tasks by replacing static arrays with lists of runnable tasks.<br />(Amar Kamat via
omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2919">HADOOP-2919</a>.
 Reduce the number of memory copies done during the
+map output sorting. Also adds two config variables:
+io.sort.spill.percent - the percentages of io.sort.mb that should
+                        cause a spill (default 80%)
+io.sort.record.percent - the percent of io.sort.mb that should
+                         hold key/value indexes (default 5%)<br />(cdouglas via omalley)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._bug_fixes_')">
 BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(53)
+</a>&nbsp;&nbsp;&nbsp;(68)
     <ol id="trunk_(unreleased_changes)_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2195">HADOOP-2195</a>.
'-mkdir' behaviour is now closer to Linux shell in case of
 errors.<br />(Mahadev Konar via rangadi)</li>
@@ -275,15 +305,37 @@
 the recursive flag.<br />(Mahadev Konar via dhruba)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3012">HADOOP-3012</a>.
dfs -mv file to user home directory throws exception if
 the user home directory does not exist.<br />(Mahadev Konar via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3066">HADOOP-3066</a>.
Should not require superuser privilege to query if hdfs is in
+safe mode<br />(jimk)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3040">HADOOP-3040</a>.
If the input line starts with the separator char, the key
+is set as empty.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3080">HADOOP-3080</a>.
Removes flush calls from JobHistory.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3086">HADOOP-3086</a>.
Adds the testcase missed during commit of hadoop-3040.<br />(Amareshwari Sriramadasu
via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2983">HADOOP-2983</a>.
[HOD] Fixes the problem - local_fqdn() returns None when
+gethostbyname_ex doesn't return any FQDNs.<br />(Craig Macdonald via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3046">HADOOP-3046</a>.
Fix the raw comparators for Text and BytesWritables
+to use the provided length rather than recompute it.<br />(omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3094">HADOOP-3094</a>.
Fix BytesWritable.toString to avoid extending the sign bit<br />(Owen O'Malley via cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3067">HADOOP-3067</a>.
DFSInputStream's position read does not close the sockets.<br />(rangadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3073">HADOOP-3073</a>.
close() on SocketInputStream or SocketOutputStream should
+close the underlying channel.<br />(rangadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3087">HADOOP-3087</a>.
Fixes a problem to do with refreshing of loadHistory.jsp.<br />(Amareshwari Sriramadasu
via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2982">HADOOP-2982</a>.
Fixes a problem in the way HOD looks for free nodes.<br />(Hemanth Yamijala via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3065">HADOOP-3065</a>.
Better logging message if the rack location of a datanode
+cannot be determined.<br />(Devaraj Das via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3064">HADOOP-3064</a>.
Commas in a file path should not be treated as delimiters.<br />(Hairong Kuang via shv)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2997">HADOOP-2997</a>.
Adds test for non-writable serializer. Also fixes a problem
+introduced by <a href="http://issues.apache.org/jira/browse/HADOOP-2399">HADOOP-2399</a>.<br
/>(Tom White via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3114">HADOOP-3114</a>.
Fix TestDFSShell on Windows.<br />(Lohit Vijaya Renu via cdouglas)</li>
     </ol>
   </li>
 </ul>
-<h2><a href="javascript:toggleList('release_0.16.2_-_unreleased_')">Release 0.16.2
- Unreleased
+<h2><a href="javascript:toggleList('release_0.16.2_-_2008-04-02_')">Release 0.16.2
- 2008-04-02
 </a></h2>
-<ul id="release_0.16.2_-_unreleased_">
-  <li><a href="javascript:toggleList('release_0.16.2_-_unreleased_._bug_fixes_')">
 BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(6)
-    <ol id="release_0.16.2_-_unreleased_._bug_fixes_">
+<ul id="release_0.16.2_-_2008-04-02_">
+  <li><a href="javascript:toggleList('release_0.16.2_-_2008-04-02_._bug_fixes_')">
 BUG FIXES
+</a>&nbsp;&nbsp;&nbsp;(19)
+    <ol id="release_0.16.2_-_2008-04-02_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3011">HADOOP-3011</a>.
Prohibit distcp from overwriting directories on the
 destination filesystem with files.<br />(cdouglas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3033">HADOOP-3033</a>.
The BlockReceiver thread in the datanode writes data to
@@ -297,6 +349,34 @@
 </li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3042">HADOOP-3042</a>.
Updates the Javadoc in JobConf.getOutputPath to reflect
 the actual temporary path.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3007">HADOOP-3007</a>.
Tolerate mirror failures while DataNode is replicating
+blocks as it used to before.<br />(rangadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2944">HADOOP-2944</a>.
Fixes a "Run on Hadoop" wizard NPE when creating a
+Location from the wizard.<br />(taton)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3049">HADOOP-3049</a>.
Fixes a problem in MultiThreadedMapRunner to do with
+catching RuntimeExceptions.<br />(Alejandro Abdelnur via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3039">HADOOP-3039</a>.
Fixes a problem to do with exceptions in tasks not
+killing jobs.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3027">HADOOP-3027</a>.
Fixes a problem to do with adding a shutdown hook in
+FileSystem.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3056">HADOOP-3056</a>.
Fix distcp when the target is an empty directory by
+making sure the directory is created first.<br />(cdouglas and acmurthy
+via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3070">HADOOP-3070</a>.
Protect the trash emptier thread from null pointer
+exceptions.<br />(Koji Noguchi via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3084">HADOOP-3084</a>.
Fix HftpFileSystem to work for zero-length files.<br />(cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3107">HADOOP-3107</a>.
Fix NPE when fsck invokes getListings.<br />(dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3103">HADOOP-3103</a>.
[HOD] Hadoop.tmp.dir should not be set to cluster
+directory. (Vinod Kumar Vavilapalli via ddas).
+</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3104">HADOOP-3104</a>.
Limit MultithreadedMapRunner to have a fixed length queue
+between the RecordReader and the map threads.<br />(Alejandro Abdelnur via
+omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2833">HADOOP-2833</a>.
Do not use "Dr. Who" as the default user in JobClient.
+A valid user name is required. (Tsz Wo (Nicholas), SZE via rangadi)
+</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3128">HADOOP-3128</a>.
Throw RemoteException in setPermissions and setOwner of
+DistributedFileSystem.<br />(shv via nigel)</li>
     </ol>
   </li>
 </ul>
@@ -319,12 +399,14 @@
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.16.1_-_2008-03-13_._improvements_')">
 IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(3)
+</a>&nbsp;&nbsp;&nbsp;(4)
     <ol id="release_0.16.1_-_2008-03-13_._improvements_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2371">HADOOP-2371</a>.
User guide for file permissions in HDFS.<br />(Robert Chansler via rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2730">HADOOP-2730</a>.
HOD documentation update.<br />(Vinod Kumar Vavilapalli via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2911">HADOOP-2911</a>.
Make the information printed by the HOD allocate and
 info commands less verbose and clearer.<br />(Vinod Kumar via nigel)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3098">HADOOP-3098</a>.
Allow more characters in user and group names while
+using -chown and -chgrp commands.<br />(rangadi)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.16.1_-_2008-03-13_._bug_fixes_')">
 BUG FIXES
@@ -661,7 +743,7 @@
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.16.0_-_2008-02-07_._bug_fixes_')">
 BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(91)
+</a>&nbsp;&nbsp;&nbsp;(92)
     <ol id="release_0.16.0_-_2008-02-07_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2583">HADOOP-2583</a>.
 Fixes a bug in the Eclipse plug-in UI to edit locations.
 Plug-in version is now synchronized with Hadoop version.
@@ -863,6 +945,7 @@
 issue.  (Tsz Wo (Nicholas), SZE via dhruba)
 </li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2768">HADOOP-2768</a>.
Fix performance regression caused by <a href="http://issues.apache.org/jira/browse/HADOOP-1707">HADOOP-1707</a>.<br
/>(dhruba borthakur via nigel)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3108">HADOOP-3108</a>.
Fix NPE in setPermission and setOwner.<br />(shv)</li>
     </ol>
   </li>
 </ul>

Modified: hadoop/core/trunk/docs/mapred_tutorial.html
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/docs/mapred_tutorial.html?rev=643793&r1=643792&r2=643793&view=diff
==============================================================================
--- hadoop/core/trunk/docs/mapred_tutorial.html (original)
+++ hadoop/core/trunk/docs/mapred_tutorial.html Wed Apr  2 01:42:43 2008
@@ -276,6 +276,9 @@
 <a href="#IsolationRunner">IsolationRunner</a>
 </li>
 <li>
+<a href="#Debugging">Debugging</a>
+</li>
+<li>
 <a href="#JobControl">JobControl</a>
 </li>
 <li>
@@ -289,7 +292,7 @@
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10C11">Source Code</a>
+<a href="#Source+Code-N10C63">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -1857,7 +1860,7 @@
           <em>symlink</em> the cached file(s) into the <span class="codefrag">current
working 
           directory</span> of the task via the 
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
-          DistributedCache.createSymlink(Path, Configuration)</a> api. Files 
+          DistributedCache.createSymlink(Configuration)</a> api. Files 
           have <em>execution permissions</em> set.</p>
 <a name="N10B0A"></a><a name="Tool"></a>
 <h4>Tool</h4>
@@ -1923,13 +1926,75 @@
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single

           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10B6F"></a><a name="JobControl"></a>
+<a name="N10B6F"></a><a name="Debugging"></a>
+<h4>Debugging</h4>
+<p>The Map/Reduce framework provides a facility to run user-provided 
+          scripts for debugging. When a map/reduce task fails, the user can 
+          run a debug script to post-process the task logs, i.e. the task's
+          stdout, stderr, syslog and jobconf. The stdout and stderr of the
+          user-provided debug script are printed on the diagnostics. 
+          These outputs are also displayed on the job UI on demand. </p>
+<p> The following sections discuss how to submit a debug script
+          along with the job. To submit the debug script, it first has to
+          be distributed. Then the script has to be supplied in the Configuration. </p>
+<a name="N10B7B"></a><a name="How+to+distribute+script+file%3A"></a>
+<h5> How to distribute script file: </h5>
+<p>
+          To distribute the debug script file, first copy the file to the dfs.
+          The file can be distributed by setting the property 
+          "mapred.cache.files" with the value "path"#"script-name". 
+          If more than one file has to be distributed, the files can be added
+          as comma-separated paths. This property can also be set by the APIs
+          <a href="api/org/apache/hadoop/filecache/DistributedCache.html#addCacheFile(java.net.URI,%20org.apache.hadoop.conf.Configuration)">
+          DistributedCache.addCacheFile(URI,conf) </a> and
+          <a href="api/org/apache/hadoop/filecache/DistributedCache.html#setCacheFiles(java.net.URI[],%20org.apache.hadoop.conf.Configuration)">
+          DistributedCache.setCacheFiles(URIs,conf) </a> where the URI is of 
+          the form "hdfs://host:port/'absolutepath'#'script-name'". 
+          For Streaming, the file can be added through the 
+          command line option -cacheFile.
+          </p>
+<p>
+          The file has to be symlinked in the current working directory 
+          of the task. To create a symlink for the file, the property 
+          "mapred.create.symlink" is set to "yes". This can also be set by the
+          <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
+          DistributedCache.createSymlink(Configuration) </a> api.
+          </p>
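
For illustration, a minimal Java sketch of the distribution step described above, using the DistributedCache calls the documentation names; the HDFS path /debug/myscript.sh, the symlink name myscript and the class name are hypothetical placeholders, and host:port stands for the namenode address as in the text.

    // Hedged sketch: distribute a debug script and symlink it into the task's
    // working directory. The path and symlink name below are made-up examples.
    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class DebugScriptDistribution {
      public static void distribute(JobConf conf) throws Exception {
        // Same effect as setting "mapred.cache.files" to
        // "hdfs://host:port/debug/myscript.sh#myscript"
        DistributedCache.addCacheFile(
            new URI("hdfs://host:port/debug/myscript.sh#myscript"), conf);
        // Same effect as setting "mapred.create.symlink" to "yes"; the script
        // then appears as "myscript" in the task's current working directory.
        DistributedCache.createSymlink(conf);
      }
    }
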
+<a name="N10B94"></a><a name="How+to+submit+script%3A"></a>
+<h5> How to submit script: </h5>
+<p> A quick way to submit the debug script is to set values for the 
+          properties "mapred.map.task.debug.script" and 
+          "mapred.reduce.task.debug.script" for debugging the map and reduce
+          tasks respectively. These properties can also be set by using the APIs 
+          <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapDebugScript(java.lang.String)">
+          JobConf.setMapDebugScript(String) </a> and
+          <a href="api/org/apache/hadoop/mapred/JobConf.html#setReduceDebugScript(java.lang.String)">
+          JobConf.setReduceDebugScript(String) </a>. For Streaming, the debug 
+          script can be submitted with the command-line options -mapdebug and
+          -reducedebug for debugging the mapper and reducer respectively.</p>
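
In the same illustrative vein, a hedged sketch of the JobConf calls named above; the script path ./myscript is an assumption that matches the symlink from the previous sketch.

    // Hedged sketch: point the job at the previously distributed debug script.
    import org.apache.hadoop.mapred.JobConf;

    public class DebugScriptSubmission {
      public static void setDebugScripts(JobConf conf) {
        // Same effect as setting "mapred.map.task.debug.script" and
        // "mapred.reduce.task.debug.script" in the job configuration.
        conf.setMapDebugScript("./myscript");
        conf.setReduceDebugScript("./myscript");
      }
    }

For Streaming jobs the same effect comes from the -mapdebug and -reducedebug options mentioned above.
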
+<p>The arguments to the script are the task's stdout, stderr, 
+          syslog and jobconf files. The debug command, run on the node where
+          the map/reduce task failed, is: <br>
+          
+<span class="codefrag"> $script $stdout $stderr $syslog $jobconf </span> 
+</p>
+<p> Pipes programs have the C++ program name as a fifth argument
+          to the command. Thus for pipes programs the command is <br> 
+          
+<span class="codefrag">$script $stdout $stderr $syslog $jobconf $program </span>
+          
+</p>
+<a name="N10BB6"></a><a name="Default+Behavior%3A"></a>
+<h5> Default Behavior: </h5>
+<p> For pipes, a default script is run that processes core dumps under
+          gdb, prints the stack trace and gives information about the running threads. </p>
+<a name="N10BC1"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
-<a name="N10B7C"></a><a name="Data+Compression"></a>
+<a name="N10BCE"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -1943,7 +2008,7 @@
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10B9C"></a><a name="Intermediate+Outputs"></a>
+<a name="N10BEE"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -1964,7 +2029,7 @@
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             api.</p>
-<a name="N10BC8"></a><a name="Job+Outputs"></a>
+<a name="N10C1A"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -1984,7 +2049,7 @@
 </div>
 
     
-<a name="N10BF7"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10C49"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which
uses many of the
@@ -1994,7 +2059,7 @@
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a>

       Hadoop installation.</p>
-<a name="N10C11"></a><a name="Source+Code-N10C11"></a>
+<a name="N10C63"></a><a name="Source+Code-N10C63"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -3204,7 +3269,7 @@
 </tr>
         
 </table>
-<a name="N11373"></a><a name="Sample+Runs"></a>
+<a name="N113C5"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3372,7 +3437,7 @@
 <br>
         
 </p>
-<a name="N11447"></a><a name="Highlights"></a>
+<a name="N11499"></a><a name="Highlights"></a>
 <h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves
upon the 
         previous one by using some features offered by the Map-Reduce framework:


