Date: Wed, 21 Dec 2011 18:45:30 +0000 (UTC)
From: "Alan Gates (Commented) (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <1966077700.36418.1324493130978.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <273053985.36291.1324491330887.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HIVE-2670) A cluster test utility for Hive

[
https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174292#comment-13174292 ]

Alan Gates commented on HIVE-2670:
----------------------------------

Attached a first patch. This is not ready for inclusion yet; I'm just putting it up here to start getting feedback. The following will need to be resolved before it is checked in:
# Currently it just has the base harness code included as a tar file. This really should be externed from the Pig code base, as HCatalog does.
# I don't know if this is the right place in SVN or not. I put it all in a test-e2e directory right under trunk. I need feedback on whether this is a good spot or whether somewhere else would be preferred.
# Connect the top-level build.xml to this so it is possible to invoke the tests from the top-level directory. I was waiting to do this until I had feedback on the proper directory structure.

How to use it:

After applying the patch you will need to copy the harness.tar file (attached) to test-e2e, since that is not done for you by the patch tool.

First you need an existing Hadoop cluster (it can be very small, just a few nodes) and a MySQL database. I ran my tests against Hadoop 0.20.205.0, but this should run against any 0.20.x version of Hadoop. Then:
# Run the script test-e2e/scripts/create_test_db.sql against your MySQL database as a user that can create users and databases and grant privileges to users (root is a good choice).
# Run "ant package" in the top-level Hive directory.
# cd test-e2e
# ant -Dharness.hadoop.home= -Dharness.hive.home= deploy
# ant -Dharness.hadoop.home= -Dharness.hive.home= deploy

The harness.hive.home value will usually be $CWD/../build/dist.

The basic design of this test harness is that each test consists of three phases: run_test, generate_benchmark, and compare_results. In run_test a particular test is run. generate_benchmark runs the same or a similar test against a known source of truth.
compare_results then compares the results and declares the test to have succeeded, failed, or aborted. The harness delegates each of these three functions to drivers that are specific to different types of tests.

This patch includes two drivers, a Hive driver and a Hive command line driver. The Hive driver uses the MySQL database as a source of truth. Each SQL script is run against Hive and against MySQL and the results compared using the Unix cksum tool.

For more information on the test harness, including how to add tests to it, see https://cwiki.apache.org/confluence/display/PIG/HowToTest

The Hive driver does not yet support running alternate SQL for benchmarking nor using an old version of Hive for the benchmarks, though those should be added sometime.

> A cluster test utility for Hive
> -------------------------------
>
> Key: HIVE-2670
> URL: https://issues.apache.org/jira/browse/HIVE-2670
> Project: Hive
> Issue Type: New Feature
> Components: Testing Infrastructure
> Reporter: Alan Gates
> Attachments: harness.tar, hive_cluster_test.patch
>
>
> Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness.
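
The three-phase design described in the comment (run_test, generate_benchmark, compare_results, each delegated to a driver) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the harness's actual code: the names Driver, run_harness, and FakeDriver are hypothetical, zlib.crc32 stands in for the Unix cksum tool (it is not the same CRC variant), and treating an exception during a phase as "aborted" is an assumption about what that status means.

```python
import zlib
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical driver interface; method names mirror the three phases."""

    @abstractmethod
    def run_test(self, test: dict) -> bytes:
        """Run the test against the system under test and return its output."""

    @abstractmethod
    def generate_benchmark(self, test: dict) -> bytes:
        """Run the same (or a similar) test against the source of truth."""

    def compare_results(self, actual: bytes, expected: bytes) -> bool:
        # Stand-in for a cksum comparison: compare CRC-32 and byte length.
        return (zlib.crc32(actual), len(actual)) == (
            zlib.crc32(expected),
            len(expected),
        )


def run_harness(driver: Driver, tests: list) -> dict:
    """Run every test through the three phases and record a status per test."""
    results = {}
    for test in tests:
        try:
            actual = driver.run_test(test)
            expected = driver.generate_benchmark(test)
        except Exception:
            # Assumption: a test that cannot be run at all counts as aborted.
            results[test["name"]] = "aborted"
            continue
        ok = driver.compare_results(actual, expected)
        results[test["name"]] = "succeeded" if ok else "failed"
    return results


class FakeDriver(Driver):
    """Toy driver with canned outputs, for illustration only."""

    def __init__(self, actual: dict, expected: dict):
        self.actual = actual
        self.expected = expected

    def run_test(self, test: dict) -> bytes:
        out = self.actual[test["name"]]
        if out is None:
            raise RuntimeError("test could not be run")
        return out

    def generate_benchmark(self, test: dict) -> bytes:
        return self.expected[test["name"]]
```

A real driver would replace FakeDriver's canned outputs with calls to the system under test and to the source of truth (Hive and MySQL in the case of the Hive driver); the harness loop itself would not need to change.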