hadoop-common-commits mailing list archives

From acmur...@apache.org
Subject svn commit: r611887 [1/3] - in /lucene/hadoop/trunk: CHANGES.txt docs/mapred_tutorial.html docs/mapred_tutorial.pdf src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
Date Mon, 14 Jan 2008 18:43:33 GMT
Author: acmurthy
Date: Mon Jan 14 10:43:32 2008
New Revision: 611887

URL: http://svn.apache.org/viewvc?rev=611887&view=rev
Log:
HADOOP-2574. Fixed mapred_tutorial.xml to correct minor errors with the WordCount examples.

Modified:
    lucene/hadoop/trunk/CHANGES.txt
    lucene/hadoop/trunk/docs/mapred_tutorial.html
    lucene/hadoop/trunk/docs/mapred_tutorial.pdf
    lucene/hadoop/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?rev=611887&r1=611886&r2=611887&view=diff
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Mon Jan 14 10:43:32 2008
@@ -449,6 +449,9 @@
     HADOOP-2570. "work" directory created unconditionally, and symlinks
     created from the task cwds.
 
+    HADOOP-2574. Fixed mapred_tutorial.xml to correct minor errors with the
+    WordCount examples. (acmurthy) 
+
 Release 0.15.2 - 2008-01-02
 
   BUG FIXES

Modified: lucene/hadoop/trunk/docs/mapred_tutorial.html
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/docs/mapred_tutorial.html?rev=611887&r1=611886&r2=611887&view=diff
==============================================================================
--- lucene/hadoop/trunk/docs/mapred_tutorial.html (original)
+++ lucene/hadoop/trunk/docs/mapred_tutorial.html Mon Jan 14 10:43:32 2008
@@ -277,13 +277,13 @@
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10B9C">Source Code</a>
+<a href="#Source+Code-N10BBD">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
 </li>
 <li>
-<a href="#Salient+Points">Salient Points</a>
+<a href="#Highlights">Highlights</a>
 </li>
 </ul>
 </li>
@@ -420,7 +420,12 @@
 <p>
 <span class="codefrag">WordCount</span> is a simple application that counts the number of
       occurences of each word in a given input set.</p>
-<a name="N100DA"></a><a name="Source+Code"></a>
+<p>This works with a 
+      <a href="quickstart.html#Standalone+Operation">local-standalone</a>,
+      <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
+      <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
+      Hadoop installation.</p>
+<a name="N100E9"></a><a name="Source+Code"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -451,7 +456,7 @@
             
 <td colspan="1" rowspan="1">3.</td>
             <td colspan="1" rowspan="1">
-              <span class="codefrag">import java.io.Exception;</span>
+              <span class="codefrag">import java.io.IOException;</span>
             </td>
           
 </tr>
@@ -546,7 +551,7 @@
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">
-                public static class MapClass extends MapReduceBase 
+                public static class Map extends MapReduceBase 
                 implements Mapper&lt;LongWritable, Text, Text, IntWritable&gt; {
               </span>
             </td>
@@ -860,7 +865,7 @@
 <td colspan="1" rowspan="1">45.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setMapperClass(MapClass.class);</span>
+              <span class="codefrag">conf.setMapperClass(Map.class);</span>
             </td>
           
 </tr>
@@ -924,7 +929,7 @@
 <td colspan="1" rowspan="1">52.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setInputPath(new Path(args[1]));</span>
+              <span class="codefrag">conf.setInputPath(new Path(args[0]));</span>
             </td>
           
 </tr>
@@ -934,7 +939,7 @@
 <td colspan="1" rowspan="1">53.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setOutputPath(new Path(args[2]));</span>
+              <span class="codefrag">conf.setOutputPath(new Path(args[1]));</span>
             </td>
           
 </tr>
@@ -983,20 +988,23 @@
 </tr>
         
 </table>
-<a name="N1045C"></a><a name="Usage"></a>
+<a name="N1046B"></a><a name="Usage"></a>
 <h3 class="h4">Usage</h3>
 <p>Assuming <span class="codefrag">HADOOP_HOME</span> is the root of the installation and 
         <span class="codefrag">HADOOP_VERSION</span> is the Hadoop version installed, compile 
         <span class="codefrag">WordCount.java</span> and create a jar:</p>
 <p>
           
+<span class="codefrag">$ mkdir wordcount_classes</span>
+<br>
+          
 <span class="codefrag">
             $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar 
-              WordCount.java
+              -d wordcount_classes WordCount.java
           </span>
 <br>
           
-<span class="codefrag">$ jar -cvf /usr/joe/wordcount.jar WordCount.class</span> 
+<span class="codefrag">$ jar -cvf /usr/joe/wordcount.jar -C wordcount_classes/ .</span> 
         
 </p>
 <p>Assuming that:</p>
@@ -1075,7 +1083,7 @@
 <br>
         
 </p>
-<a name="N104D8"></a><a name="Walk-through"></a>
+<a name="N104EB"></a><a name="Walk-through"></a>
 <h3 class="h4">Walk-through</h3>
 <p>The <span class="codefrag">WordCount</span> application is quite straight-forward.</p>
 <p>The <span class="codefrag">Mapper</span> implementation (lines 14-26), via the 
@@ -1185,7 +1193,7 @@
 </div>
     
     
-<a name="N1058F"></a><a name="Map-Reduce+-+User+Interfaces"></a>
+<a name="N105A2"></a><a name="Map-Reduce+-+User+Interfaces"></a>
 <h2 class="h3">Map-Reduce - User Interfaces</h2>
 <div class="section">
 <p>This section provides a reasonable amount of detail on every user-facing 
@@ -1204,12 +1212,12 @@
 <p>Finally, we will wrap up by discussing some useful features of the
       framework such as the <span class="codefrag">DistributedCache</span>, 
       <span class="codefrag">IsolationRunner</span> etc.</p>
-<a name="N105C8"></a><a name="Payload"></a>
+<a name="N105DB"></a><a name="Payload"></a>
 <h3 class="h4">Payload</h3>
 <p>Applications typically implement the <span class="codefrag">Mapper</span> and 
         <span class="codefrag">Reducer</span> interfaces to provide the <span class="codefrag">map</span> and 
         <span class="codefrag">reduce</span> methods. These form the core of the job.</p>
-<a name="N105DD"></a><a name="Mapper"></a>
+<a name="N105F0"></a><a name="Mapper"></a>
 <h4>Mapper</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/Mapper.html">
@@ -1265,7 +1273,7 @@
           <a href="api/org/apache/hadoop/io/compress/CompressionCodec.html">
           CompressionCodec</a> to be used via the <span class="codefrag">JobConf</span>.
           </p>
-<a name="N10657"></a><a name="How+Many+Maps%3F"></a>
+<a name="N1066A"></a><a name="How+Many+Maps%3F"></a>
 <h5>How Many Maps?</h5>
 <p>The number of maps is usually driven by the total size of the 
             inputs, that is, the total number of blocks of the input files.</p>
@@ -1278,7 +1286,7 @@
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setNumMapTasks(int)">
             setNumMapTasks(int)</a> (which only provides a hint to the framework) 
             is used to set it even higher.</p>
-<a name="N1066F"></a><a name="Reducer"></a>
+<a name="N10682"></a><a name="Reducer"></a>
 <h4>Reducer</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/Reducer.html">
@@ -1301,18 +1309,18 @@
 <p>
 <span class="codefrag">Reducer</span> has 3 primary phases: shuffle, sort and reduce.
           </p>
-<a name="N1069F"></a><a name="Shuffle"></a>
+<a name="N106B2"></a><a name="Shuffle"></a>
 <h5>Shuffle</h5>
 <p>Input to the <span class="codefrag">Reducer</span> is the sorted output of the
             mappers. In this phase the framework fetches the relevant partition 
             of the output of all the mappers, via HTTP.</p>
-<a name="N106AC"></a><a name="Sort"></a>
+<a name="N106BF"></a><a name="Sort"></a>
 <h5>Sort</h5>
 <p>The framework groups <span class="codefrag">Reducer</span> inputs by keys (since 
             different mappers may have output the same key) in this stage.</p>
 <p>The shuffle and sort phases occur simultaneously; while 
             map-outputs are being fetched they are merged.</p>
-<a name="N106BB"></a><a name="Secondary+Sort"></a>
+<a name="N106CE"></a><a name="Secondary+Sort"></a>
 <h5>Secondary Sort</h5>
 <p>If equivalence rules for grouping the intermediate keys are 
               required to be different from those for grouping keys before 
@@ -1323,7 +1331,7 @@
               JobConf.setOutputKeyComparatorClass(Class)</a> can be used to 
               control how intermediate keys are grouped, these can be used in 
               conjunction to simulate <em>secondary sort on values</em>.</p>
-<a name="N106D4"></a><a name="Reduce"></a>
+<a name="N106E7"></a><a name="Reduce"></a>
 <h5>Reduce</h5>
 <p>In this phase the 
             <a href="api/org/apache/hadoop/mapred/Reducer.html#reduce(K2, java.util.Iterator, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)">
@@ -1339,7 +1347,7 @@
             progress, set application-level status messages and update 
             <span class="codefrag">Counters</span>, or just indicate that they are alive.</p>
 <p>The output of the <span class="codefrag">Reducer</span> is <em>not sorted</em>.</p>
-<a name="N10702"></a><a name="How+Many+Reduces%3F"></a>
+<a name="N10715"></a><a name="How+Many+Reduces%3F"></a>
 <h5>How Many Reduces?</h5>
 <p>The right number of reduces seems to be <span class="codefrag">0.95</span> or 
             <span class="codefrag">1.75</span> multiplied by (&lt;<em>no. of nodes</em>&gt; * 
@@ -1354,7 +1362,7 @@
 <p>The scaling factors above are slightly less than whole numbers to 
             reserve a few reduce slots in the framework for speculative-tasks and
             failed tasks.</p>
-<a name="N10727"></a><a name="Reducer+NONE"></a>
+<a name="N1073A"></a><a name="Reducer+NONE"></a>
 <h5>Reducer NONE</h5>
 <p>It is legal to set the number of reduce-tasks to <em>zero</em> if 
             no reduction is desired.</p>
@@ -1364,7 +1372,7 @@
             setOutputPath(Path)</a>. The framework does not sort the 
             map-outputs before writing them out to the <span class="codefrag">FileSystem</span>.
             </p>
-<a name="N10742"></a><a name="Partitioner"></a>
+<a name="N10755"></a><a name="Partitioner"></a>
 <h4>Partitioner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/Partitioner.html">
@@ -1378,7 +1386,7 @@
 <p>
 <a href="api/org/apache/hadoop/mapred/lib/HashPartitioner.html">
           HashPartitioner</a> is the default <span class="codefrag">Partitioner</span>.</p>
-<a name="N10761"></a><a name="Reporter"></a>
+<a name="N10774"></a><a name="Reporter"></a>
 <h4>Reporter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/Reporter.html">
@@ -1397,7 +1405,7 @@
           </p>
 <p>Applications can also update <span class="codefrag">Counters</span> using the 
           <span class="codefrag">Reporter</span>.</p>
-<a name="N1078B"></a><a name="OutputCollector"></a>
+<a name="N1079E"></a><a name="OutputCollector"></a>
 <h4>OutputCollector</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputCollector.html">
@@ -1408,7 +1416,7 @@
 <p>Hadoop Map-Reduce comes bundled with a 
         <a href="api/org/apache/hadoop/mapred/lib/package-summary.html">
         library</a> of generally useful mappers, reducers, and partitioners.</p>
-<a name="N107A6"></a><a name="Job+Configuration"></a>
+<a name="N107B9"></a><a name="Job+Configuration"></a>
 <h3 class="h4">Job Configuration</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobConf.html">
@@ -1463,7 +1471,7 @@
         <a href="api/org/apache/hadoop/conf/Configuration.html#set(java.lang.String, java.lang.String)">set(String, String)</a>/<a href="api/org/apache/hadoop/conf/Configuration.html#get(java.lang.String, java.lang.String)">get(String, String)</a>
         to set/get arbitrary parameters needed by applications. However, use the 
         <span class="codefrag">DistributedCache</span> for large amounts of (read-only) data.</p>
-<a name="N10830"></a><a name="Task+Execution+%26+Environment"></a>
+<a name="N10843"></a><a name="Task+Execution+%26+Environment"></a>
 <h3 class="h4">Task Execution &amp; Environment</h3>
 <p>The <span class="codefrag">TaskTracker</span> executes the <span class="codefrag">Mapper</span>/ 
         <span class="codefrag">Reducer</span>  <em>task</em> as a child process in a separate jvm.
@@ -1523,7 +1531,7 @@
         loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
         System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
         System.load</a>.</p>
-<a name="N108A5"></a><a name="Job+Submission+and+Monitoring"></a>
+<a name="N108B8"></a><a name="Job+Submission+and+Monitoring"></a>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
@@ -1559,7 +1567,7 @@
 <p>Normally the user creates the application, describes various facets 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
-<a name="N108E3"></a><a name="Job+Control"></a>
+<a name="N108F6"></a><a name="Job+Control"></a>
 <h4>Job Control</h4>
 <p>Users may need to chain map-reduce jobs to accomplish complex
           tasks which cannot be done via a single map-reduce job. This is fairly
@@ -1595,7 +1603,7 @@
             </li>
           
 </ul>
-<a name="N1090D"></a><a name="Job+Input"></a>
+<a name="N10920"></a><a name="Job+Input"></a>
 <h3 class="h4">Job Input</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
@@ -1643,7 +1651,7 @@
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         compressed files with the above extensions cannot be <em>split</em> and 
         each compressed file is processed in its entirety by a single mapper.</p>
-<a name="N10977"></a><a name="InputSplit"></a>
+<a name="N1098A"></a><a name="InputSplit"></a>
 <h4>InputSplit</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
@@ -1657,7 +1665,7 @@
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           logical split.</p>
-<a name="N1099C"></a><a name="RecordReader"></a>
+<a name="N109AF"></a><a name="RecordReader"></a>
 <h4>RecordReader</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
@@ -1669,7 +1677,7 @@
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           responsibility of processing record boundaries and presents the tasks 
           with keys and values.</p>
-<a name="N109BF"></a><a name="Job+Output"></a>
+<a name="N109D2"></a><a name="Job+Output"></a>
 <h3 class="h4">Job Output</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
@@ -1694,7 +1702,7 @@
 <p>
 <span class="codefrag">TextOutputFormat</span> is the default 
         <span class="codefrag">OutputFormat</span>.</p>
-<a name="N109E8"></a><a name="Task+Side-Effect+Files"></a>
+<a name="N109FB"></a><a name="Task+Side-Effect+Files"></a>
 <h4>Task Side-Effect Files</h4>
 <p>In some applications, component tasks need to create and/or write to
           side-files, which differ from the actual job-output files.</p>
@@ -1720,7 +1728,7 @@
           JobConf.getOutputPath()</a>, and the framework will promote them 
           similarly for succesful task-attempts, thus eliminating the need to 
           pick unique paths per task-attempt.</p>
-<a name="N10A1D"></a><a name="RecordWriter"></a>
+<a name="N10A30"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -1728,9 +1736,9 @@
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N10A34"></a><a name="Other+Useful+Features"></a>
+<a name="N10A47"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N10A3A"></a><a name="Counters"></a>
+<a name="N10A4D"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -1744,7 +1752,7 @@
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
-<a name="N10A65"></a><a name="DistributedCache"></a>
+<a name="N10A78"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -1777,7 +1785,7 @@
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymlink(Path, Configuration)</a> api. Files 
           have <em>execution permissions</em> set.</p>
-<a name="N10AA3"></a><a name="Tool"></a>
+<a name="N10AB6"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
@@ -1817,7 +1825,7 @@
             </span>
           
 </p>
-<a name="N10AD5"></a><a name="IsolationRunner"></a>
+<a name="N10AE8"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -1841,13 +1849,13 @@
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10B08"></a><a name="JobControl"></a>
+<a name="N10B1B"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
-<a name="N10B15"></a><a name="Data+Compression"></a>
+<a name="N10B28"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -1861,7 +1869,7 @@
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10B35"></a><a name="Intermediate+Outputs"></a>
+<a name="N10B48"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -1882,7 +1890,7 @@
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             api.</p>
-<a name="N10B61"></a><a name="Job+Outputs"></a>
+<a name="N10B74"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -1902,12 +1910,17 @@
 </div>
 
     
-<a name="N10B90"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10BA3"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
-      features provided by the Map-Reduce framework we discussed so far:</p>
-<a name="N10B9C"></a><a name="Source+Code-N10B9C"></a>
+      features provided by the Map-Reduce framework we discussed so far.</p>
+<p>This needs the HDFS to be up and running, especially for the 
+      <span class="codefrag">DistributedCache</span>-related features. Hence it only works with a 
+      <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
+      <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
+      Hadoop installation.</p>
+<a name="N10BBD"></a><a name="Source+Code-N10BBD"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -2042,7 +2055,7 @@
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">
-                public static class MapClass extends MapReduceBase 
+                public static class Map extends MapReduceBase 
                 implements Mapper&lt;LongWritable, Text, Text, IntWritable&gt; {
               </span>
             </td>
@@ -2202,7 +2215,7 @@
 <td colspan="1" rowspan="1">32.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">Path[] patternsFiles = new Path[0];</span>
+              <span class="codefrag">if (job.getBoolean("wordcount.skip.patterns", false)) {</span>
             </td>
           
 </tr>
@@ -2211,8 +2224,8 @@
             
 <td colspan="1" rowspan="1">33.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">try {</span>
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              <span class="codefrag">Path[] patternsFiles = new Path[0];</span>
             </td>
           
 </tr>
@@ -2222,6 +2235,16 @@
 <td colspan="1" rowspan="1">34.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              <span class="codefrag">try {</span>
+            </td>
+          
+</tr>
+          
+<tr>
+            
+<td colspan="1" rowspan="1">35.</td>
+            <td colspan="1" rowspan="1">
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
                 patternsFiles = DistributedCache.getLocalCacheFiles(job);
               </span>
@@ -2231,9 +2254,9 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">35.</td>
+<td colspan="1" rowspan="1">36.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">} catch (IOException ioe) {</span>
             </td>
           
@@ -2241,9 +2264,9 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">36.</td>
+<td colspan="1" rowspan="1">37.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
                 System.err.println("Caught exception while getting cached files: " 
                 + StringUtils.stringifyException(ioe));
@@ -2254,9 +2277,9 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">37.</td>
+<td colspan="1" rowspan="1">38.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
             </td>
           
@@ -2264,9 +2287,9 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">38.</td>
+<td colspan="1" rowspan="1">39.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">for (Path patternsFile : patternsFiles) {</span>
             </td>
           
@@ -2274,9 +2297,9 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">39.</td>
+<td colspan="1" rowspan="1">40.</td>
             <td colspan="1" rowspan="1">
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">parseSkipFile(patternsFile);</span>
             </td>
           
@@ -2284,7 +2307,17 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">40.</td>
+<td colspan="1" rowspan="1">41.</td>
+            <td colspan="1" rowspan="1">
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              <span class="codefrag">}</span>
+            </td>
+          
+</tr>
+          
+<tr>
+            
+<td colspan="1" rowspan="1">42.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2294,7 +2327,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">41.</td>
+<td colspan="1" rowspan="1">43.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2304,14 +2337,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">42.</td>
+<td colspan="1" rowspan="1">44.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">43.</td>
+<td colspan="1" rowspan="1">45.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">private void parseSkipFile(Path patternsFile) {</span>
@@ -2321,7 +2354,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">44.</td>
+<td colspan="1" rowspan="1">46.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">try {</span>
@@ -2331,7 +2364,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">45.</td>
+<td colspan="1" rowspan="1">47.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2344,7 +2377,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">46.</td>
+<td colspan="1" rowspan="1">48.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">String pattern = null;</span>
@@ -2354,7 +2387,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">47.</td>
+<td colspan="1" rowspan="1">49.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">while ((pattern = fis.readLine()) != null) {</span>
@@ -2364,7 +2397,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">48.</td>
+<td colspan="1" rowspan="1">50.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">patternsToSkip.add(pattern);</span>
@@ -2374,7 +2407,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">49.</td>
+<td colspan="1" rowspan="1">51.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2384,7 +2417,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">50.</td>
+<td colspan="1" rowspan="1">52.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">} catch (IOException ioe) {</span>
@@ -2394,7 +2427,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">51.</td>
+<td colspan="1" rowspan="1">53.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2409,7 +2442,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">52.</td>
+<td colspan="1" rowspan="1">54.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2419,7 +2452,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">53.</td>
+<td colspan="1" rowspan="1">55.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2429,14 +2462,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">54.</td>
+<td colspan="1" rowspan="1">56.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">55.</td>
+<td colspan="1" rowspan="1">57.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2450,7 +2483,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">56.</td>
+<td colspan="1" rowspan="1">58.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2464,14 +2497,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">57.</td>
+<td colspan="1" rowspan="1">59.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">58.</td>
+<td colspan="1" rowspan="1">60.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">for (String pattern : patternsToSkip) {</span>
@@ -2481,7 +2514,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">59.</td>
+<td colspan="1" rowspan="1">61.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">line = line.replaceAll(pattern, "");</span>
@@ -2491,7 +2524,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">60.</td>
+<td colspan="1" rowspan="1">62.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2501,14 +2534,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">61.</td>
+<td colspan="1" rowspan="1">63.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">62.</td>
+<td colspan="1" rowspan="1">64.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">StringTokenizer tokenizer = new StringTokenizer(line);</span>
@@ -2518,7 +2551,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">63.</td>
+<td colspan="1" rowspan="1">65.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">while (tokenizer.hasMoreTokens()) {</span>
@@ -2528,7 +2561,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">64.</td>
+<td colspan="1" rowspan="1">66.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">word.set(tokenizer.nextToken());</span>
@@ -2538,7 +2571,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">65.</td>
+<td colspan="1" rowspan="1">67.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">output.collect(word, one);</span>
@@ -2548,7 +2581,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">66.</td>
+<td colspan="1" rowspan="1">68.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">reporter.incrCounter(Counters.INPUT_WORDS, 1);</span>
@@ -2558,7 +2591,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">67.</td>
+<td colspan="1" rowspan="1">69.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2568,14 +2601,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">68.</td>
+<td colspan="1" rowspan="1">70.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">69.</td>
+<td colspan="1" rowspan="1">71.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">if ((++numRecords % 100) == 0) {</span>
@@ -2585,7 +2618,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">70.</td>
+<td colspan="1" rowspan="1">72.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2599,7 +2632,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">71.</td>
+<td colspan="1" rowspan="1">73.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2609,7 +2642,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">72.</td>
+<td colspan="1" rowspan="1">74.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2619,7 +2652,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">73.</td>
+<td colspan="1" rowspan="1">75.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2629,14 +2662,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">74.</td>
+<td colspan="1" rowspan="1">76.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">75.</td>
+<td colspan="1" rowspan="1">77.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">
@@ -2649,7 +2682,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">76.</td>
+<td colspan="1" rowspan="1">78.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2663,7 +2696,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">77.</td>
+<td colspan="1" rowspan="1">79.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">int sum = 0;</span>
@@ -2673,7 +2706,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">78.</td>
+<td colspan="1" rowspan="1">80.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">while (values.hasNext()) {</span>
@@ -2683,7 +2716,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">79.</td>
+<td colspan="1" rowspan="1">81.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">sum += values.next().get();</span>
@@ -2693,7 +2726,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">80.</td>
+<td colspan="1" rowspan="1">82.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2703,7 +2736,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">81.</td>
+<td colspan="1" rowspan="1">83.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">output.collect(key, new IntWritable(sum));</span>
@@ -2713,7 +2746,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">82.</td>
+<td colspan="1" rowspan="1">84.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2723,7 +2756,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">83.</td>
+<td colspan="1" rowspan="1">85.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2733,14 +2766,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">84.</td>
+<td colspan="1" rowspan="1">86.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">85.</td>
+<td colspan="1" rowspan="1">87.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">public int run(String[] args) throws Exception {</span>
@@ -2750,7 +2783,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">86.</td>
+<td colspan="1" rowspan="1">88.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2762,7 +2795,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">87.</td>
+<td colspan="1" rowspan="1">89.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setJobName("wordcount");</span>
@@ -2772,14 +2805,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">88.</td>
+<td colspan="1" rowspan="1">90.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">89.</td>
+<td colspan="1" rowspan="1">91.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setOutputKeyClass(Text.class);</span>
@@ -2789,7 +2822,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">90.</td>
+<td colspan="1" rowspan="1">92.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setOutputValueClass(IntWritable.class);</span>
@@ -2799,24 +2832,24 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">91.</td>
+<td colspan="1" rowspan="1">93.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">92.</td>
+<td colspan="1" rowspan="1">94.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setMapperClass(MapClass.class);</span>
+              <span class="codefrag">conf.setMapperClass(Map.class);</span>
             </td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">93.</td>
+<td colspan="1" rowspan="1">95.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setCombinerClass(Reduce.class);</span>
@@ -2826,7 +2859,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">94.</td>
+<td colspan="1" rowspan="1">96.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setReducerClass(Reduce.class);</span>
@@ -2836,14 +2869,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">95.</td>
+<td colspan="1" rowspan="1">97.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">96.</td>
+<td colspan="1" rowspan="1">98.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setInputFormat(TextInputFormat.class);</span>
@@ -2853,7 +2886,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">97.</td>
+<td colspan="1" rowspan="1">99.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">conf.setOutputFormat(TextOutputFormat.class);</span>
@@ -2863,14 +2896,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">98.</td>
+<td colspan="1" rowspan="1">100.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">99.</td>
+<td colspan="1" rowspan="1">101.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2882,7 +2915,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">100.</td>
+<td colspan="1" rowspan="1">102.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">for (int i=0; i &lt; args.length; ++i) {</span>
@@ -2892,17 +2925,17 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">101.</td>
+<td colspan="1" rowspan="1">103.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">if ("-skip".equals(args[i]) {</span>
+              <span class="codefrag">if ("-skip".equals(args[i])) {</span>
             </td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">102.</td>
+<td colspan="1" rowspan="1">104.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -2914,7 +2947,19 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">103.</td>
+<td colspan="1" rowspan="1">105.</td>
+            <td colspan="1" rowspan="1">
+              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+              <span class="codefrag">
+                conf.setBoolean("wordcount.skip.patterns", true);
+              </span>
+            </td>
+          
+</tr>
+          
+<tr>
+            
+<td colspan="1" rowspan="1">106.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">} else {</span>
@@ -2924,7 +2969,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">104.</td>
+<td colspan="1" rowspan="1">107.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">other_args.add(args[i]);</span>
@@ -2934,7 +2979,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">105.</td>
+<td colspan="1" rowspan="1">108.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2944,7 +2989,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">106.</td>
+<td colspan="1" rowspan="1">109.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -2954,41 +2999,41 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">107.</td>
+<td colspan="1" rowspan="1">110.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">108.</td>
+<td colspan="1" rowspan="1">111.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setInputPath(new Path(other_args[0]));</span>
+              <span class="codefrag">conf.setInputPath(new Path(other_args.get(0)));</span>
             </td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">109.</td>
+<td colspan="1" rowspan="1">112.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setOutputPath(new Path(other_args[1]));</span>
+              <span class="codefrag">conf.setOutputPath(new Path(other_args.get(1)));</span>
             </td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">110.</td>
+<td colspan="1" rowspan="1">113.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">111.</td>
+<td colspan="1" rowspan="1">114.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">JobClient.runJob(conf);</span>
@@ -2998,7 +3043,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">112.</td>
+<td colspan="1" rowspan="1">115.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">return 0;</span>
@@ -3008,7 +3053,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">113.</td>
+<td colspan="1" rowspan="1">116.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -3018,14 +3063,14 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">114.</td>
+<td colspan="1" rowspan="1">117.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1">115.</td>
+<td colspan="1" rowspan="1">118.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">
@@ -3037,7 +3082,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">116.</td>
+<td colspan="1" rowspan="1">119.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">
@@ -3050,7 +3095,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">117.</td>
+<td colspan="1" rowspan="1">120.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               <span class="codefrag">System.exit(res);</span>
@@ -3060,7 +3105,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">118.</td>
+<td colspan="1" rowspan="1">121.</td>
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;
               <span class="codefrag">}</span>
@@ -3070,7 +3115,7 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">119.</td>
+<td colspan="1" rowspan="1">122.</td>
             <td colspan="1" rowspan="1">
               <span class="codefrag">}</span>
             </td>
@@ -3079,13 +3124,13 @@
           
 <tr>
             
-<td colspan="1" rowspan="1">120.</td>
+<td colspan="1" rowspan="1">123.</td>
             <td colspan="1" rowspan="1"></td>
           
 </tr>
         
 </table>
-<a name="N112CE"></a><a name="Sample+Runs"></a>
+<a name="N1131F"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3112,7 +3157,7 @@
 <span class="codefrag">$ bin/hadoop dfs -cat /usr/joe/wordcount/input/file02</span>
 <br>
           
-<span class="codefrag">Hello Hadoop, Goodbye the Hadoop.</span>
+<span class="codefrag">Hello Hadoop, Goodbye to hadoop.</span>
         
 </p>
 <p>Run the application:</p>
@@ -3142,9 +3187,6 @@
 <span class="codefrag">Hadoop,    1</span>
 <br>
           
-<span class="codefrag">Hadoop.    1</span>
-<br>
-          
 <span class="codefrag">Hello    2</span>
 <br>
           
@@ -3154,7 +3196,10 @@
 <span class="codefrag">World,    1</span>
 <br>
           
-<span class="codefrag">the    1</span>
+<span class="codefrag">hadoop.    1</span>
+<br>
+          
+<span class="codefrag">to    1</span>
 <br>
         
 </p>
@@ -3176,7 +3221,7 @@
 <span class="codefrag">\!</span>
 <br>
           
-<span class="codefrag">the</span>
+<span class="codefrag">to</span>
 <br>
         
 </p>
@@ -3205,7 +3250,7 @@
 <span class="codefrag">Goodbye    1</span>
 <br>
           
-<span class="codefrag">Hadoop    2</span>
+<span class="codefrag">Hadoop    1</span>
 <br>
           
 <span class="codefrag">Hello    2</span>
@@ -3213,6 +3258,9 @@
           
 <span class="codefrag">World    2</span>
 <br>
+          
+<span class="codefrag">hadoop    1</span>
+<br>
         
 </p>
 <p>Run it once more, this time switch-off case-sensitivity:</p>
@@ -3250,8 +3298,8 @@
 <br>
         
 </p>
-<a name="N1139E"></a><a name="Salient+Points"></a>
-<h3 class="h4">Salient Points</h3>
+<a name="N113F3"></a><a name="Highlights"></a>
+<h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map-Reduce framework:
         </p>
@@ -3260,26 +3308,26 @@
 <li>
             Demonstrates how applications can access configuration parameters
             in the <span class="codefrag">configure</span> method of the <span class="codefrag">Mapper</span> (and
-            <span class="codefrag">Reducer</span>) implementations (lines 28-41).
+            <span class="codefrag">Reducer</span>) implementations (lines 28-43).
           </li>
           
 <li>
             Demonstrates how the <span class="codefrag">DistributedCache</span> can be used to 
             distribute read-only data needed by the jobs. Here it allows the user 
-            to specify word-patterns to skip while counting (line 102).
+            to specify word-patterns to skip while counting (line 104).
           </li>
           
 <li>
             Demonstrates the utility of the <span class="codefrag">Tool</span> interface and the
             <span class="codefrag">GenericOptionsParser</span> to handle generic Hadoop 
-            command-line options (lines 85-86, 116).
+            command-line options (lines 87-116, 119).
           </li>
           
 <li>
-            Demonstrates how applications can use <span class="codefrag">Counters</span> (line 66)
+            Demonstrates how applications can use <span class="codefrag">Counters</span> (line 68)
             and how they can set application-specific status information via 
             the <span class="codefrag">Reporter</span> instance passed to the <span class="codefrag">map</span> (and
-            <span class="codefrag">reduce</span>) method (line 70).
+            <span class="codefrag">reduce</span>) method (line 72).
           </li>
         
 </ul>


