hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HowToDevelopUnitTests" by SteveLoughran
Date Mon, 16 Sep 2013 09:16:54 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToDevelopUnitTests" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/HowToDevelopUnitTests?action=diff&rev1=11&rev2=12

Comment:
explain the mini clusters

  
  This page contains Hadoop testing and test development guidelines.
  
+ == How Hadoop Unit Tests Work ==
+ 
+ Hadoop unit tests are all designed to run on a local machine, rather than on a full-scale
Hadoop cluster. The ongoing work for that is in Apache Bigtop.
+ 
+ The unit tests work by creating MiniDFS, MiniYARN and MiniMR clusters, as appropriate.
These all run the code of the specific services.
+ 
+ === MiniDFSCluster ===
+ 
+ {{{org.apache.hadoop.hdfs.MiniDFSCluster}}}
+ 
+ Emulates an HDFS cluster with the given number of (emulated) datanodes. After creating one
via its builder API, you can build up the HDFS URI {{{"hdfs://localhost:" + miniDFSCluster.getNameNodePort()}}}.
This can be used as the base URI for filesystem operations.
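
As a minimal sketch of how such a base URI composes with paths, the following uses only `java.net.URI`. The class name `HdfsUriSketch`, its two helpers and the port 8020 are placeholders for illustration; a real test would take the port from `getNameNodePort()`.

```java
import java.net.URI;

public class HdfsUriSketch {

  /** Build the base URI of a filesystem listening on localhost at the given port. */
  public static URI baseUri(int nameNodePort) {
    return URI.create("hdfs://localhost:" + nameNodePort + "/");
  }

  /** Resolve a path, given relative to the filesystem root, against the base URI. */
  public static URI pathUnder(URI base, String relativePath) {
    return base.resolve(relativePath);
  }

  public static void main(String[] args) {
    // 8020 is a placeholder: a mini cluster chooses its own port at startup.
    URI base = baseUri(8020);
    System.out.println(pathUnder(base, "user/test/data.txt"));
  }
}
```

Keeping the trailing `/` on the base URI matters here: without it, `resolve()` has no root path to resolve against.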
+ 
+ 
+ {{{#!java
+ // assumes conf is the test's Configuration and testName names the current test
+ File baseDir = new File("./target/hdfs/" + testName).getAbsoluteFile();
+ FileUtil.fullyDelete(baseDir);
+ conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, baseDir.getAbsolutePath());
+ MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(conf);
+ MiniDFSCluster hdfsCluster = builder.build();
+ String hdfsURI = "hdfs://localhost:" + hdfsCluster.getNameNodePort() + "/";
+ }}}
+ 
+ === MiniYARNCluster ===
+ 
+ {{{org.apache.hadoop.yarn.server.MiniYARNCluster}}}
+ 
+ Starts the YARN services in the JVM, with the given number of simulated Node Managers. You
can then submit work to the ResourceManager. The AMs (and any containers in which they
execute code) run in separate processes, as on a real YARN cluster. The
key difference is that the classpath of the test JVM is passed down to the spawned processes
(how? Which environment variable?) so that they pick up the same version of the Hadoop JARs.
+ 
+ {{{#!java
+ YarnConfiguration conf = new YarnConfiguration();
+ conf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 64);
+ conf.setClass(YarnConfiguration.RM_SCHEDULER,
+               FifoScheduler.class, ResourceScheduler.class);
+ HoyaUtils.patchConfiguration(conf);
+ miniCluster = new MiniYARNCluster(name, noOfNodeManagers, numLocalDirs, numLogDirs);
+ miniCluster.init(conf);
+ miniCluster.start();
+ 
+ //once the cluster is created, you can get its configuration
+ //with the binding details to the cluster added from the minicluster
+ YarnConfiguration appConf = new YarnConfiguration(miniCluster.getConfig());
+ 
+ }}}
+ 
+ The results of a test run are saved to the filesystem, where they can be retrieved
by hand.
+ 
+ {{{
+ cat target/TestKillAM/TestKillAM-logDir-nm-0_0/application_1378993847080_0001/container_1378993847080_0001_01_000001/out.txt
+ }}}
+ 
+  1. The output is not automatically merged into the JUnit results (if anyone can fix this,
code would be welcome)
+  1. The output is formatted by whatever logging tools and configuration the AM and its containers
use, such as the specific version of {{{Apache Log4J}}} and the {{{log4j.properties}}} file that are on
the classpath.
+  1. The names of the base directory and log directory are determined by the name given to the test
cluster, so a unique cluster name per test class is invaluable.
+  1. The more node managers you create, the more log directories you will have to look into.
A single NM is easier to work with.
+  1. The application- and container- directory names vary every run.
+  1. You can {{{tail -f}}} the {{{out.txt}}} and {{{err.txt}}} files while the tests are
running.
+  1. {{{jps -v}}} will list the running applications; {{{kill}}} can then be used to kill
the processes, and so test the application's resilience to failures.
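
Because the application and container directory names change on every run, it can be easier to search for the log files than to type the paths. A minimal sketch using only `java.nio.file`; the class `ContainerLogFinder` and its method are illustrative names, not part of Hadoop, and the default directory simply reuses the `target/TestKillAM` example above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ContainerLogFinder {

  /**
   * Recursively collect every regular file called fileName under baseDir.
   * Returns an empty list if baseDir is not a directory.
   */
  public static List<Path> findLogs(Path baseDir, String fileName) throws IOException {
    if (!Files.isDirectory(baseDir)) {
      return Collections.emptyList();
    }
    try (Stream<Path> paths = Files.walk(baseDir)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().equals(fileName))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    // The base directory is named after the test cluster.
    Path baseDir = Paths.get(args.length > 0 ? args[0] : "target/TestKillAM");
    for (Path log : findLogs(baseDir, "out.txt")) {
      System.out.println(log);
    }
  }
}
```

The same call with `"err.txt"` lists the error streams.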
+ 
+ It's a bit inelegant to work with, but functional. The ability to list and terminate the processes
makes writing failure simulation tests possible, which is important as production applications
need to be designed to handle failures of child containers.
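
The list-then-kill step could also be driven from test code instead of a shell. The sketch below swaps `jps` and `kill` for the JDK's `ProcessHandle` API, so it assumes Java 9 or later; `ProcessKiller` and its methods are hypothetical helpers, and the marker string is whatever distinguishes the target process on its command line. On some platforms, or for processes owned by other users, the command line is not visible and such processes are skipped.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ProcessKiller {

  /** List live processes, other than this JVM, whose command line contains marker. */
  public static List<ProcessHandle> findByCommandLine(String marker) {
    long self = ProcessHandle.current().pid();
    return ProcessHandle.allProcesses()
        .filter(p -> p.pid() != self)
        .filter(ProcessHandle::isAlive)
        .filter(p -> p.info().commandLine()
            .map(cmd -> cmd.contains(marker))
            .orElse(false))
        .collect(Collectors.toList());
  }

  /** Forcibly terminate each process; returns how many terminations were requested. */
  public static int killAll(List<ProcessHandle> processes) {
    int requested = 0;
    for (ProcessHandle p : processes) {
      if (p.destroyForcibly()) {
        requested++;
      }
    }
    return requested;
  }

  public static void main(String[] args) {
    if (args.length == 0) {
      System.err.println("usage: ProcessKiller <command-line marker>");
      return;
    }
    List<ProcessHandle> targets = findByCommandLine(args[0]);
    targets.forEach(p -> System.out.println("killing " + p.pid()));
    System.out.println(killAll(targets) + " termination(s) requested");
  }
}
```

Excluding the current JVM matters: the test runner's own command line may contain the marker, and `destroyForcibly()` refuses to act on the calling process.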
+ 
+ === MiniMRYarnCluster ===
+ 
+ {{{org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster}}}
+ 
+ This adds an MR History Server to the MiniYARNCluster, and extends the cluster configuration
to refer to it. MR applications can then easily talk to the RM to submit jobs, with the history
being preserved.
+ 
+ === Using the Mini clusters in tests ===
+ 
+ The clusters take time to set up and tear down, so should only be created once per test
class, in a {{{@BeforeClass}}}-tagged static method. They should be stopped in an {{{@AfterClass}}} method:
the HDFS cluster via {{{MiniDFSCluster.shutdown()}}} and the YARN clusters via their {{{stop()}}} method.
+ 
+ 
+ == Writing JUnit Tests ==
+ 
  === Cheat sheet of tests development for JUnit v4 ===
  
  Hadoop has been using JUnit4 for a while now; however, it seems that many new tests are still
being developed for JUnit v3. This is partially JUnit's fault because, for a false sense of
backward compatibility, all v3 {{{junit.framework}}} classes are packaged along with the v4 classes
and it all is called {{{junit-4.10.jar}}}. This is necessary to permit mixing of the old and
new tests, and to allow the new v4 tests to run under the existing JUnit test runners in IDEs
and build tools.
  
  Here's a short list of traps to be aware of, so as not to develop yet another JUnit
v3 test case:
  
-    * YES, new unit tests HAVE to be developed for JUnit v4. No patches which add v3 test
case classes will be approved. 
+    * YES, new unit tests HAVE to be developed for JUnit v4. No patches which add v3 test
case classes will be approved.
     * DO NOT use {{{junit.framework}}} imports
     * DO use only {{{org.junit}}} imports
     * DO NOT {{{extends TestCase}}} (now, you can create your own test class structures if
needed!)
@@ -53, +126 @@

  
   1. Use the JUnit assertions, not the Java {{{assert}}} statement.
   1. In equality tests, place the expected value first
-  1. Give assertions meaningful error messages. 
+  1. Give assertions meaningful error messages.
  
  === Bad ===
  
@@ -61, +134 @@

  /** a test */
  @Test
  public void testBuildVersion() {
-   Namenode nn = getNameNode(); 
+   Namenode nn = getNameNode();
    assertNotNull(nn);
    NamespaceInfo info = nn.versionRequest() ;
    assertEquals(info.getBuildVersion(),"32");
@@ -78, +151 @@

   */
  @Test
  public void testBuildVersion() {
-   Namenode nn = getNameNode(); 
+   Namenode nn = getNameNode();
    assertNotNull("No namenode", nn);
    NamespaceInfo info = nn.versionRequest() ;
    assertEquals("Build version wrong", "32", info.getBuildVersion());
@@ -138, +211 @@

  
  == References ==
  
-  * [[http://code.google.com/p/t2framework/wiki/JUnitQuickTutorial|Quick tutorial]] on the
JUnit website. 
+  * [[http://code.google.com/p/t2framework/wiki/JUnitQuickTutorial|Quick tutorial]] on the
JUnit website.
  
