From: Apache Wiki
To: Apache Wiki
Reply-To: common-dev@hadoop.apache.org
Mailing-List: contact common-commits-help@hadoop.apache.org; run by ezmlm
Date: Tue, 17 Aug 2010 22:19:00 -0000
Subject: [Hadoop Wiki] Update of "HowToUseSystemTestFramework" by KonstantinBoudnik

Dear Wiki user,

You have
subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToUseSystemTestFramework" page has been changed by KonstantinBoudnik.
http://wiki.apache.org/hadoop/HowToUseSystemTestFramework?action=diff&rev1=1&rev2=2

--------------------------------------------------

- Herriot (HADOOP-6332) system test development guide is coming soon
+ = System tests development How To =
+ This document describes how to develop cluster-based system tests with the new Hadoop cluster test framework (code name Herriot). For more information about Herriot, see [[https://issues.apache.org/jira/browse/HADOOP-6332/|HADOOP-6332]].
+
+ Here you can find [[http://people.apache.org/~cos/herriot.docs/index.html|up-to-date javadocs]] of the framework's APIs.
+
+ ==== Definitions ====
+ The following definitions are used throughout the guide:
+  * '''Test client''': a computer/source location where the execution of tests is initiated
+  * '''Daemon proxy''': an RPC class which provides access to the APIs of a remote Hadoop daemon (NN, DN, JT, TT)
+  * '''Cluster proxy''': a Herriot class which provides a convenient, combined API to control a Hadoop cluster from the Herriot test client
+  * '''Herriot library''': the combination of the above APIs residing on a '''test client'''
+
+ ==== Test development environment ====
+ To develop tests for Herriot you don't need any extra tools. Herriot is embedded into the Hadoop source code base. The APIs exposed for test development are static and are presented in the form of interfaces. Test-framework-specific classes, such as the *cluster proxy*, are available in the form of Java classes.
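
One way to picture how the terms defined above fit together is the plain-Java sketch below. Every type and method name in it (`DaemonProxy`, `FakeDaemon`, `ClusterProxySketch`, `ping`, `unreachableDaemons`) is a hypothetical stand-in chosen for illustration and is not part of the Herriot API; the real interfaces are the ones described in the javadocs linked above. The sketch only assumes that a cluster proxy groups several daemon proxies behind one API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical daemon proxy: the test client's handle on one remote daemon. */
interface DaemonProxy {
    String daemonName();
    boolean ping();
}

/** In-memory stand-in used here in place of a real RPC connection (assumption). */
final class FakeDaemon implements DaemonProxy {
    private final String name;
    private final boolean alive;

    FakeDaemon(String name, boolean alive) {
        this.name = name;
        this.alive = alive;
    }

    @Override public String daemonName() { return name; }
    @Override public boolean ping() { return alive; }
}

/** Hypothetical cluster proxy: groups daemon proxies behind one API. */
final class ClusterProxySketch {
    private final List<DaemonProxy> daemons = new ArrayList<DaemonProxy>();

    void register(DaemonProxy daemon) { daemons.add(daemon); }

    int daemonCount() { return daemons.size(); }

    /** Returns the names of the daemons that did not answer the ping. */
    List<String> unreachableDaemons() {
        List<String> down = new ArrayList<String>();
        for (DaemonProxy daemon : daemons) {
            if (!daemon.ping()) {
                down.add(daemon.daemonName());
            }
        }
        return down;
    }
}
```

In this picture the test code talks only to the cluster-level object, which is roughly the role the guide assigns to the cluster proxy on the test client.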
+ First, clone a git repository and check out the latest Hadoop branch:
+ {{{
+ git clone git://github.com/apache/hadoop-hdfs.git hdfs
+ cd hdfs
+ git checkout -t -b trunk origin/trunk
+ }}}
+ For Common and MapReduce, adjust the above commands accordingly, and change the branch name if you need a different one.
+
+ All you need to do is make sure that the following source directories are included in the project definition of your favorite IDE:
+ {{{
+ src/test/system/aop
+ src/test/system/java
+ src/test/system/test
+ }}}
+ The first two are needed only for test framework development. So if your purpose is Herriot test development, you can limit your configuration to the last location only.
+
+ ==== Tests structure ====
+ Herriot tests make use of the JUnit4 framework (they may also use !TestNG if this framework is exposed to Hadoop). JUnit fixtures such as `@Before`, `@After` and so on are used in Herriot tests. For our immediate purposes Herriot tests are JUnit tests. Therefore, if you know how to develop JUnit tests, you are good to go.
+
+ In the current environment tests should be placed under
+ {{{
+ src/
+   test/
+     system/
+       test/
+         [org.apache.hadoop.hdfs|org.apache.hadoop.mapred]
+ }}}
+
+ Framework-related classes belong to `org.apache.hadoop.test.system` for the shared code and/or to `org.apache.hadoop.hdfs.test.system` and `org.apache.hadoop.mapreduce.test.system` for the HDFS- and MR-specific parts, respectively.
+
+ ==== Examples ====
+ Let's take a look at a real test example available from `src/test/system/test/org/apache/hadoop/mapred/TestCluster.java`. As always, your best source of information and knowledge about any software system is its source code :)
+
+  * Let's start with the `@BeforeClass` fixture, which creates an instance of the *cluster proxy* (in this case for a !MapReduce cluster) that provides access to the !MapReduce daemons (the Job Tracker [JT] and Task Trackers [TTs]).
The second call creates all needed *daemon proxies* and makes them available through the *Herriot library* APIs. As part of this setup Herriot guarantees that the test environment is clean and all internal states are reset. Also, the number of exceptions that have arisen in the daemon logs is saved. This is particularly useful because it allows us to disregard exceptions raised in the log files before a Herriot test was started. `@BeforeClass` guarantees that only one instance of the *cluster proxy* is created (since this is an expensive operation) for use in all test cases defined in the test class.
+
+ {{{#!java
+ @BeforeClass
+ public static void before() throws Exception {
+   // First call: create the cluster proxy once for the whole test class.
+   cluster = MRCluster.createCluster(new Configuration());
+   // Second call: create the daemon proxies and reset the test environment.
+   cluster.setUp();
+ }
+ }}}
+
+  * It is easy to submit and verify a !MapReduce job:
+ {{{#!java
+ @Test
+ public void testJobSubmission() throws Exception {
+   Configuration conf = new Configuration(cluster.getConf());
+   SleepJob job = new SleepJob();
+   job.setConf(conf);
+   conf = job.setupJobConf(1, 1, 100, 100, 100, 100);
+   // Submit through the JT client and verify the run, then check the job history.
+   RunningJob rJob = cluster.getJTClient().submitAndVerifyJob(conf);
+   cluster.getJTClient().verifyJobHistory(rJob.getID());
+ }
+ }}}
+ The new JT API call `submitAndVerifyJob(Configuration conf)` checks whether the job has completed successfully by looking into the job details (e.g. number of maps and reducers), monitoring their progress and the success of job execution, as well as proper cleanup. If any of the conditions aren't met, appropriate exceptions are raised.
+
+  * The following example demonstrates how to modify a cluster's configuration and restart the daemons with it. At the end, the cluster is restarted with its original configuration.
+ {{{#!java
+ @Test
+ public void testPushConfig() throws Exception {
+   final String DUMMY_CONFIG_STRING = "mapred.newdummy.conf";
+   String confFile = "mapred-site.xml";
+   Hashtable<String, Long> prop = new Hashtable<String, Long>();
+   prop.put(DUMMY_CONFIG_STRING, 1L);
+   Configuration daemonConf = cluster.getJTClient().getProxy().getDaemonConf();
+   Assert.assertTrue("Dummy variable is expected to be null before restart.",
+       daemonConf.get(DUMMY_CONFIG_STRING) == null);
+   cluster.restartClusterWithNewConfig(prop, confFile);
+   // The next two statements are garbled in the source message; they are
+   // reconstructed (assumption) from the matching calls above and below.
+   Configuration newconf = cluster.getJTClient().getProxy().getDaemonConf();
+   Assert.assertTrue(newconf.get(DUMMY_CONFIG_STRING).equals("1"));
+   cluster.restart();
+   daemonConf = cluster.getJTClient().getProxy().getDaemonConf();
+   Assert.assertTrue("Dummy variable is expected to be null after restart.",
+       daemonConf.get(DUMMY_CONFIG_STRING) == null);
+ }
+ }}}
+
+ The above example also works for Hadoop clusters where DFS and MR are started under different user accounts. For this to happen, multi-user support needs to be enabled in Herriot's configuration file (see below).
+
+  * Of course, it is nice to clean up after yourself when everything is done:
+ {{{#!java
+ @AfterClass
+ public static void after() throws Exception {
+   cluster.tearDown();
+ }
+ }}}
+
+ ==== Tests execution environment ====
+ For execution of the tests the test client needs to have:
+  * access to a cluster which runs an instrumented build of Hadoop
+  * available copies of the configuration files from a deployed cluster (located under `$HADOOP_CONF_DIR` in this example).
+
+ No Hadoop binaries are required on the machine where you run tests (there is a special case of a single-node cluster where you need to build instrumented Hadoop before running the tests; see below).
Herriot tests are executed directly from a source code tree by issuing the following ant command:
+ {{{
+ ant test-system -Dhadoop.conf.dir.deployed=${HADOOP_CONF_DIR}
+ }}}
+ To run just a single test, use the usual property `-Dtestcase=testname`.
+
+ Once the test run is complete, the results and logs can be found under the `build-fi/system/test` directory.
+
+ Normally, the '''test client''' is expected to be a cluster's gateway. However, it should be possible to run tests from one's deployment machine, laptop, or from a cluster node.
+
+ [image not preserved; its alt text begins "deploy"]
+
+ The following software tools should be available on a *test client*:
+  * Ant (version 1.7+)
+  * Java6
+
+ In the future a means to run Herriot tests from a jar file may also be provided.
+
+ Some of the tests might throw lzcodec exceptions. To address this, `build.xml` will accept a new system property, `lib.file.path`, which needs to be set before the tests are run.
+ {{{
+ ant test-system -Dhadoop.conf.dir.deployed=${HADOOP_CONF_DIR} -Dlib.file.path=${HADOOP_HOME}/lib/native/Linux-i386-32
+ }}}
+ The feature is [[https://issues.apache.org/jira/browse/MAPREDUCE-1790|pending MAPREDUCE-1790]]
+
+ `hadoop-common/src/test/system/scripts/*` has to have executable permissions; otherwise some of the MapReduce test cases will fail.
+
+ ==== Settings specific for security environment ====
+ To make Herriot protocols trusted in a secure Hadoop environment, `hadoop-policy-system-test.xml` must be included in `hadoop-policy.xml`.
+
+ /* TODO: not yet implemented
+ An execution-specific file, `proxyusers`, needs to be created under `${HADOOP_CONF_DIR}`. This file has to contain the usernames of the users who can impersonate others. This feature only makes sense if security is enabled.
+
+ To successfully run TestDecommisioning, the `security.admin.operations.protocol.acl` property in hadoop-policy.xml should include the test user's group in its value; the `mapreduce.cluster.administrators` property in mapred-site.xml should also include the test user's group. */
+
+ ==== Configuration files ====
+ When the Herriot '''test client''' starts, it looks for a single configuration file, `system-test.xml`. Internally, this file is supplied by an automated deployment process and should not be a concern for test developers. However, more information is provided in the deployment section of this document.
+
+ ==== Herriot cluster deployment procedure/requirements ====
+ Herriot configuration files are located in `src/test/system/config`
+
+ The framework uses its own custom interfaces for RPC control interaction with remote daemons. These are based on the standard Hadoop RPC mechanism, but are designed not to interfere with normal Hadoop traffic. On the DFS side no additional configuration is needed: the framework derives all needed information from the existing configuration files. However, to allow connectivity with !TaskTrackers the following property needs to be added to `mapred-site.xml`:
+ {{{
+ <property>
+   <name>mapred.task.tracker.report.address</name>
+   <value>0.0.0.0:50030</value>
+   <final>true</final>
+ </property>
+ }}}
+
+ This configures an extra RPC port, thus enabling direct communication with TTs that isn't normally available.
+
+ The content of `system-test.xml` needs to be customized during installation according to the macros defined in the file.
+
+ Some modification is required to enable multi-user support on the *test client* side. The Herriot distribution provides a special binary which allows setuid execution. The source code for this tool is located under `src/test/system/c++/runAs`. As an additional security guarantee, the binary must have the $HADOOP_HOME environment variable defined properly at compile time.
This reduces the risk of malicious use of the framework. To compile it, use the following ant target:
+ {{{
+ ant run-as -Drun-as.hadoop.home.dir=
+ }}}
+ As the final step, the binary has to be installed into the `test.system.hdrc.multi-user.binary.path` specified in `system-test.xml`. Configure setuid permissions with
+ {{{
+ chown root ${test.system.hdrc.multi-user.binary.path}/runAs
+ chmod 6511 ${test.system.hdrc.multi-user.binary.path}/runAs
+ }}}
+
+ ==== Single-node cluster installation ====
+ It is convenient to be able to run a cluster on one's desktop and to execute the same cluster tests in one's own environment. This is possible; the following steps need to be taken:
+ {{{
+ % cd $WORKSPACE
+ % ant binary-system
+ % export HADOOP_HOME=$WORKSPACE/build-fi/systdoop-$version-SNAPHOT
+ % export HADOOP_CONF_DIR=
+ % cd $HADOOP_HOME
+ % chmod +x bin/*
+ % ./bin/start-all.sh
+ % cd $WORKSPACE
+ % ant test-system -Dhadoop.conf.dir.deployed=$HADOOP_CONF_DIR
+ }}}
+ Edit the `src/test/system/conf/system-test.xml` file and put it under `$HADOOP_CONF_DIR`.
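
Because the test client depends on configuration files being present in the deployed configuration directory, it may help to check for them before starting a run. The sketch below is a hypothetical pre-flight helper, not part of Herriot or Hadoop: the class name `ConfDirCheck`, its method, and the idea of passing in a caller-chosen list of required file names are all assumptions made for illustration. It uses only the standard `java.nio.file` API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight helper; not part of Herriot or Hadoop. */
final class ConfDirCheck {
    private ConfDirCheck() {}

    /** Returns the required file names that are not regular files in confDir. */
    static List<String> missingFiles(Path confDir, List<String> required) {
        List<String> missing = new ArrayList<String>();
        for (String name : required) {
            if (!Files.isRegularFile(confDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Convenience overload taking the directory as a string, e.g. the value of HADOOP_CONF_DIR. */
    static List<String> missingFiles(String confDir, List<String> required) {
        return missingFiles(Paths.get(confDir), required);
    }
}
```

A caller could, for example, pass the value of `HADOOP_CONF_DIR` (assuming the variable is set) together with `system-test.xml` and `mapred-site.xml`, and refuse to launch `ant test-system` while the returned list is non-empty.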