Return-Path:
Delivered-To: apmail-hadoop-hbase-dev-archive@minotaur.apache.org
Received: (qmail 19246 invoked from network); 22 Feb 2010 20:52:50 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 22 Feb 2010 20:52:50 -0000
Received: (qmail 23175 invoked by uid 500); 22 Feb 2010 20:52:50 -0000
Delivered-To: apmail-hadoop-hbase-dev-archive@hadoop.apache.org
Received: (qmail 23154 invoked by uid 500); 22 Feb 2010 20:52:50 -0000
Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: hbase-dev@hadoop.apache.org
Delivered-To: mailing list hbase-dev@hadoop.apache.org
Received: (qmail 23144 invoked by uid 99); 22 Feb 2010 20:52:49 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 22 Feb 2010 20:52:49 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 22 Feb 2010 20:52:49 +0000
Received: from brutus.apache.org (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 5603E234C4AF for ; Mon, 22 Feb 2010 12:52:29 -0800 (PST)
Message-ID: <1537587004.441111266871949351.JavaMail.jira@brutus.apache.org>
Date: Mon, 22 Feb 2010 20:52:29 +0000 (UTC)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Updated: (HBASE-2244) META gets inconsistent in a number of crash scenarios
In-Reply-To: <564791144.407771266631768235.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[ https://issues.apache.org/jira/browse/HBASE-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2244:
-------------------------
Attachment: 2244.patch

Patch that skips opening the new daughter regions after they are created. It's not necessary. Should make splits run a little faster and take a little load off hdfs/nn.

> META gets inconsistent in a number of crash scenarios
> -----------------------------------------------------
>
> Key: HBASE-2244
> URL: https://issues.apache.org/jira/browse/HBASE-2244
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: Kannan Muthukkaruppan
> Assignee: stack
> Priority: Critical
> Fix For: 0.20.4
>
> Attachments: 2244.patch
>
>
> (Forking this issue off from HBASE-2235).
> During load testing, in a number of failure scenarios (unexpected region server deaths, etc.), we notice that META can get inconsistent. This primarily happens for regions which are in the process of being split. Manually running add_table.rb seems to fix the table's metadata just fine.
> But it would be good to do automatic cleansing (as part of the META scanner's work) and/or avoid these inconsistent states altogether.
> For example, for a particular startkey, I see all these entries:
> {code}
> test1,1204765,1266569946560 column=info:regioninfo, timestamp=1266581302018, value=REGION => {NAME => 'test1,
> 1204765,1266569946560', STARTKEY => '1204765', ENDKEY => '1441091', ENCODED => 18
> 19368969, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'test1', FAMILIES =>
> [{NAME => 'actions', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647'
> , BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> test1,1204765,1266569946560 column=info:server, timestamp=1266570029133, value=10.129.68.212:60020
> test1,1204765,1266569946560 column=info:serverstartcode, timestamp=1266570029133, value=1266562597546
> test1,1204765,1266569946560 column=info:splitB, timestamp=1266581302018, value=\x00\x071441091\x00\x00\x00\x0
> 1\x26\xE6\x1F\xDF\x27\x1Btest1,1290703,1266581233447\x00\x071290703\x00\x00\x00\x
> 05\x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x
> 00\x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00
> \x00\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSI
> ON\x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TT
> L\x00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00
> \x00\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04t
> rueh\x0FQ\xCF
> test1,1204765,1266581233447 column=info:regioninfo, timestamp=1266609172177, value=REGION => {NAME => 'test1,
> 1204765,1266581233447', STARTKEY => '1204765', ENDKEY => '1290703', ENCODED => 13
> 73493090, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'test1', FAMILIES =>
> [{NAME => 'actions', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647'
> , BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> test1,1204765,1266581233447 column=info:server, timestamp=1266604768670, value=10.129.68.213:60020
> test1,1204765,1266581233447 column=info:serverstartcode, timestamp=1266604768670, value=1266562597511
> test1,1204765,1266581233447 column=info:splitA, timestamp=1266609172177, value=\x00\x071226169\x00\x00\x00\x0
> 1\x26\xE7\xCA,\x7D\x1Btest1,1204765,1266609171581\x00\x071204765\x00\x00\x00\x05\
> x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
> x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
> 0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
> x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
> 00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
> 0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
> \xB9\xBD\xFEO
> test1,1204765,1266581233447 column=info:splitB, timestamp=1266609172177, value=\x00\x071290703\x00\x00\x00\x0
> 1\x26\xE7\xCA,\x7D\x1Btest1,1226169,1266609171581\x00\x071226169\x00\x00\x00\x05\
> x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
> x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
> 0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
> x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
> 00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
> 0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
> \xE1\xDF\xF8p
> test1,1204765,1266609171581 column=info:regioninfo, timestamp=1266609172212, value=REGION => {NAME => 'test1,
> 1204765,1266609171581', STARTKEY => '1204765', ENDKEY => '1226169', ENCODED => 21
> 34878372, TABLE => {{NAME => 'test1', FAMILIES => [{NAME => 'actions', VERSIONS =
> > '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMOR
> Y => 'false', BLOCKCACHE => 'true'}]}}
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.