hbase-issues mailing list archives

From "Keith Turner (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase
Date Tue, 03 Apr 2012 17:38:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245528#comment-13245528 ]

Keith Turner commented on HBASE-4821:
-------------------------------------

I am an Accumulo developer. There is some cruft in our test dir. The two most successful
cluster tests we have are continuous ingest and random walk. We have found lots of bugs with
these tests. I wrote a Gora version of continuous ingest that should run against HBase. The
README on GitHub has a nice description.

  https://github.com/keith-turner/goraci/

The Accumulo version of continuous ingest can be found here.

  http://svn.apache.org/repos/asf/accumulo/tags/1.4.0/test/system/continuous/

This dir contains an old set of OpenOffice slides that also give an overview of continuous
ingest. At the end of the slides is the beginning of the idea of the random walk test. I am
not sure if we have a nice description of random walk anywhere. It is a fairly simple test
framework. You write test nodes in Java and link the nodes together in a graph using XML.
You start a test client on each node in a cluster. The test client just does a random walk
of the test graph. We have found a ton of bugs in 1.3 and 1.4 using random walk.
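
To make the idea concrete, here is a minimal sketch of a random walk over a graph of test
nodes. It is not the Accumulo randomwalk code: the class name RandomWalkSketch, the TestNode
interface, the node names, and the shared state map are all hypothetical, and the graph is
built in Java rather than loaded from XML to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only; not the Accumulo randomwalk API.
class RandomWalkSketch {

    /** A test node: one action that runs each time the walk visits it. */
    interface TestNode {
        void visit(Map<String, Integer> state);
    }

    private final Map<String, TestNode> nodes = new HashMap<>();
    private final Map<String, List<String>> edges = new HashMap<>();

    void addNode(String name, TestNode node) {
        nodes.put(name, node);
        edges.putIfAbsent(name, new ArrayList<>());
    }

    void addEdge(String from, String to) {
        if (!nodes.containsKey(from) || !nodes.containsKey(to)) {
            throw new IllegalArgumentException("unknown node in edge " + from + " -> " + to);
        }
        edges.get(from).add(to);
    }

    /**
     * Visits up to maxSteps nodes starting at start. After each visit, one
     * outgoing edge is chosen uniformly at random; the walk stops early at a
     * node with no outgoing edges. Returns the names of the visited nodes.
     */
    List<String> walk(String start, int maxSteps, Random random, Map<String, Integer> state) {
        if (!nodes.containsKey(start)) {
            throw new IllegalArgumentException("unknown start node " + start);
        }
        List<String> path = new ArrayList<>();
        String current = start;
        for (int i = 0; i < maxSteps && current != null; i++) {
            nodes.get(current).visit(state);
            path.add(current);
            List<String> next = edges.get(current);
            current = next.isEmpty() ? null : next.get(random.nextInt(next.size()));
        }
        return path;
    }

    /** A tiny made-up graph: create a table, then randomly interleave writes and reads. */
    static RandomWalkSketch exampleGraph() {
        RandomWalkSketch g = new RandomWalkSketch();
        g.addNode("createTable", s -> s.put("tables", 1));
        g.addNode("write", s -> s.merge("writes", 1, Integer::sum));
        g.addNode("read", s -> s.merge("reads", 1, Integer::sum));
        g.addEdge("createTable", "write");
        g.addEdge("write", "write");
        g.addEdge("write", "read");
        g.addEdge("read", "write");
        return g;
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new HashMap<>();
        List<String> path = exampleGraph().walk("createTable", 10, new Random(), state);
        System.out.println(path + " " + state);
    }
}
```

In a real cluster test each node would perform an operation against the database and check
the result, and one such walker would run on every machine. Passing a seeded Random makes a
particular walk reproducible.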

Actually, the Accumulo features page may be the only place we give an overview of random walk.
I noticed that our random walk readme only tells you how to run it, not what it is. Below
is a link to the random walk test, but like I said, it's not very informative.

  http://svn.apache.org/repos/asf/accumulo/tags/1.4.0/test/system/randomwalk/

The actual Java code is at the link below. The framework and the test node code are all there.

  http://svn.apache.org/repos/asf/accumulo/tags/1.4.0/src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/

The short description of random walk I mentioned is here.

  http://accumulo.apache.org/notable_features.html#testing

If anyone is interested in generalizing random walk so that HBase could use it too, let me
know.

One last thing. We tested Accumulo for over a month on a 10-node cluster using continuous
ingest, random walk, and the Agitator. Below are some of the bugs we found during that time
period.

Bugs found in 1.4 testing:

  https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=labels+%3D+14_qa_bug
                
> A fully automated comprehensive distributed integration test for HBase
> ----------------------------------------------------------------------
>
>                 Key: HBASE-4821
>                 URL: https://issues.apache.org/jira/browse/HBASE-4821
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Mikhail Bautin
>            Assignee: Mikhail Bautin
>            Priority: Critical
>
> To properly verify that a particular version of HBase is good for production deployment
we need a better way to do real cluster testing after incremental changes. Running unit tests
is good, but we also need to deploy HBase to a cluster, run integration tests, load tests,
Thrift server tests, kill some region servers, kill the master, and produce a report. All
of this needs to happen in 20-30 minutes with minimal manual intervention. I think this way
we can combine agile development with high stability of the codebase. I am envisioning a high-level
framework written in a scripting language (e.g. Python) that would abstract external operations
such as "deploy to test cluster", "kill a particular server", "run load test A", "run load
test B" (we already have a few kinds of load tests implemented in Java, and we could write
a Thrift load test in Python). This tool should also produce intermediate output, making it possible
to catch problems early and restart the test.
> No implementation has yet been done. Any ideas or suggestions are welcome.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
