Message-ID: <174500112.6651270705716609.JavaMail.jira@brutus.apache.org>
Date: Thu, 8 Apr 2010 05:48:36 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hbase-issues@hadoop.apache.org
Subject: [jira] Commented: (HBASE-2341) Suite of test scripts that a.) load a cluster with a verifiable dataset and b.) do random kills of regionserver+datanodes in small cluster

    [ https://issues.apache.org/jira/browse/HBASE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854822#action_12854822 ]

Todd Lipcon commented on HBASE-2341:
------------------------------------

I've started work on some Python-based fault injections here: http://github.com/toddlipcon/gremlins

The work is very preliminary, and I plan on continuing to develop it over the next couple of weeks, but would be happy to have other people contribute. Once it's reached a more thorough state we could look at including it right in the HBase source, though it's generally useful so I plan to keep it on github as well.

> Suite of test scripts that a.) load a cluster with a verifiable dataset and b.) do random kills of regionserver+datanodes in small cluster
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2341
>                 URL: https://issues.apache.org/jira/browse/HBASE-2341
>             Project: Hadoop HBase
>          Issue Type: Task
>            Reporter: stack
>             Fix For: 0.20.5, 0.21.0
>
>         Attachments: count-slaves.rb, HBASE-2341-0.20.3.patch, test.sh, VerifiableEditor.java, VerifiableEditor.java
>
>
> We just filed hbase-2340 but discussion up on irc has it that we need something more hardcore than pussy-footing inside a single jvm as hdfs-2340 does. The point was made (tlipcon) that it's hard to ensure real recovery working if all is in the one JVM.
> So, this issue is about scripts that can:
> + load a cluster with a dataset that we can 'verify' as in we can tell if it has holes in it, if data has been lost.
> + script that does random kill of a random node on some random occasion
> + Script that can check cluster for data loss
> All above should work while cluster is under load.
> The above would not sit under junit.
> This looks like a suite that we'd want to run up in ec2 using Andrew's scripts and our donated aws credits.
> {code}
> 16:12 < tlipcon> here's my goal: we have a 5 node cluster in the back room. I want to run hbase on that at near full load for a week straight while some process goes around screwing with it
> 16:12 < tlipcon> then I want to verify that I didn't lose a single edit over that week
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
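
For illustration only, here is a minimal Python sketch of the three pieces the issue asks for: a dataset that can be verified, a check for holes or lost data, and a random choice of node to kill. It is not taken from the gremlins repository or from the attached VerifiableEditor.java. Every name in it (make_row, find_problems, pick_victim, the "row-" key format, the MD5-derived value) is an assumption made for the example, and a plain dict stands in for the cluster.

```python
from __future__ import annotations

import hashlib
import random


def make_row(seq: int) -> tuple[str, str]:
    """Return a (key, value) pair; the value is derived from the key,
    so any reader can recompute what the value should be."""
    key = f"row-{seq:010d}"  # hypothetical key layout
    value = hashlib.md5(key.encode("utf-8")).hexdigest()
    return key, value


def find_problems(rows: dict[str, str], expected_count: int) -> tuple[list[int], list[int]]:
    """Check rows 0..expected_count-1 and return (missing, corrupt)
    sequence numbers. A missing number is a hole in the dataset."""
    missing: list[int] = []
    corrupt: list[int] = []
    for seq in range(expected_count):
        key, value = make_row(seq)
        if key not in rows:
            missing.append(seq)
        elif rows[key] != value:
            corrupt.append(seq)
    return missing, corrupt


def pick_victim(nodes: list[str], rng: random.Random, max_delay_s: float) -> tuple[str, float]:
    """Pick a random node and a random delay (in seconds) before killing it.
    Actually killing the process is left out of this sketch."""
    return rng.choice(nodes), rng.uniform(0.0, max_delay_s)


if __name__ == "__main__":
    # A dict stands in for the cluster; a real run would write to HBase.
    store = dict(make_row(i) for i in range(1000))
    del store[make_row(42)[0]]          # simulate a lost edit
    store[make_row(7)[0]] = "garbage"   # simulate a corrupted edit
    missing, corrupt = find_problems(store, 1000)
    print("missing:", missing, "corrupt:", corrupt)

    # Hypothetical host names; the seed makes the kill schedule repeatable.
    node, delay = pick_victim(["node1", "node2", "node3"], random.Random(0), 3600.0)
    print(f"would kill {node} after {delay:.0f}s")
```

Because keys are sequential and values are derived from keys, the checker needs only the expected row count to detect both holes and corrupted values. Seeding the random generator is a design choice that lets a failing kill schedule be replayed.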