Date: Fri, 19 Nov 2010 15:02:13 -0500 (EST)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] Commented: (HBASE-3251) HConnectionManager.listTables() doesn't return broken tables

[ https://issues.apache.org/jira/browse/HBASE-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933942#action_12933942 ]

Ted Yu commented on HBASE-3251:
-------------------------------

Alternatively, HMaster.deleteTable() should be able to detect the dangling row in .META. and delete it.

> HConnectionManager.listTables() doesn't return broken tables
> ------------------------------------------------------------
>
> Key: HBASE-3251
> URL: https://issues.apache.org/jira/browse/HBASE-3251
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.20.6
> Reporter: Ted Yu
>
> We saw this in our integration test log - packageindex table was 'broken':
> {code}
> 2010-11-19 05:12:42,216 Thread-20 ERROR [StripedHBaseTable] Could not create packageindex
> org.apache.hadoop.hbase.TableExistsException: org.apache.hadoop.hbase.TableExistsException: packageindex
>     at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:799)
>     at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:763)
>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
> ...
> 2010-11-19 05:12:42,218 Thread-20 INFO [HBasePackageIndexTableMapperNew] Creating table packageindex - Done
> 2010-11-19 05:12:42,235 Thread-20 INFO [CodecPool] Got brand-new decompressor
> 2010-11-19 05:12:42,262 Thread-20 INFO [HBasePackageIndexTableMapperNew] OnClose called
> 2010-11-19 05:12:42,263 Thread-20 WARN [LocalJobRunner] job_local_0001
> org.apache.hadoop.hbase.TableNotFoundException: packageindex
>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:698)
>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:134)
>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:112)
> {code}
> In HConnectionManager.listTables():
> {code}
> byte[] value = result.getValue(CATALOG_FAMILY,
>     REGIONINFO_QUALIFIER);
> HRegionInfo info = null;
> if (value != null) {
>   info = Writables.getHRegionInfo(value);
> }
> // Only examine the rows where the startKey is zero length
> if (info != null && info.getStartKey().length == 0) {
>   uniqueTables.add(info.getTableDesc());
> }
> {code}
> For a broken table, there would be a row in .META. (see below), but the table wouldn't be included in uniqueTables.
> We need a way for listTables() to mark the broken table and return it so that master.jsp can show the table in a prominent way.
> {code}
> packageindex,E70888DD48276DFAD4D26FEB08DC7045,1290163034864 column=info:regioninfo, timestamp=1290188566363, value=REGION => {NAME => 'packageindex,E70888DD48276DFAD4D26FEB08DC7045,1290163034864', STARTKEY => 'E70888DD48276DFAD4D26FEB08DC7045', ENDKEY => 'E83A8362462AF0D097810F96ED7103C2', ENCODED => 2080544777, OFFLINE => true, TABLE => {{NAME => 'packageindex', FAMILIES => [{NAME => 'i', COMPRESSION => 'GZ', VERSIONS => '1', TTL => '31536000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'u', COMPRESSION => 'GZ', VERSIONS => '1', TTL => '31536000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> {code}
> Here is what led to the broken table in our cluster.
> 2010-11-19 12:49:23,067 main INFO [PackageIndexTableTest]
> [10:57am] tyu: Deleting packageindex content ...
> From hbase-hadoop-regionserver-us01-ciqps1-grid05.ciq.com.log:
> {code}
> 2010-11-19 12:49:41,119 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Caches flushed, doing commit now (which includes update scanners)
> 2010-11-19 12:49:41,121 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~16.0k for region .META.,,1 in 83ms, sequence id=48465684, compaction requested=true
> 2010-11-19 12:49:41,121 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region .META.,,1/1028785192 because: regionserver/10.202.50.105:60020.cacheFlusher
> 2010-11-19 12:54:11,353 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 6416258050001207387 lease expired
> 2010-11-19 12:54:11,353 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=633.5422MB (664317096), Free=161.05789MB (168881432), Max=794.60004MB (833198528), Counts: Blocks=75276, Access=90464772, Hit=86854034, Miss=3610738, Evictions=222, Evicted=2121113, Ratios: Hit Ratio=96.00868225097656%, Miss Ratio=3.9913196116685867%, Evicted/Run=9554.5634765625
> 2010-11-19 12:54:11,353 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 269945ms, ten times longer than scheduled: 10000
> 2010-11-19 12:54:11,353 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=633.5422MB (664317096), Free=161.05789MB (168881432), Max=794.60004MB (833198528), Counts: Blocks=75276, Access=90464772, Hit=86854034, Miss=3610738, Evictions=222, Evicted=2121113, Ratios: Hit Ratio=96.00868225097656%, Miss Ratio=3.9913196116685867%, Evicted/Run=9554.5634765625
> 2010-11-19 12:54:11,353 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -1270857692790249130 lease expired
> 2010-11-19 12:54:11,354 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=633.5422MB (664317096), Free=161.05789MB (168881432), Max=794.60004MB (833198528), Counts: Blocks=75276, Access=90464772, Hit=86854034, Miss=3610738, Evictions=222, Evicted=2121113, Ratios: Hit Ratio=96.00868225097656%, Miss Ratio=3.9913196116685867%, Evicted/Run=9554.5634765625
> ...
> 2010-11-19 12:54:11,354 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x12c520945433100 to sun.nio.ch.SelectionKeyImpl@78317d11
> java.io.IOException: TIMED OUT
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
> 2010-11-19 12:54:11,391 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x12c520945432f46 to sun.nio.ch.SelectionKeyImpl@727d3468
> java.io.IOException: TIMED OUT
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
> 2010-11-19 12:54:11,354 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6416258050001207387 on us01-ciqps1-grid05.carrieriq.com,60020,1290003836762
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1885)
>     at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
> 2010-11-19 12:54:11,354 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -1270857692790249130 on us01-ciqps1-grid05.carrieriq.com,60020,1290003836762
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1885)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1873)
>     at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
> 2010-11-19 12:54:11,354 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 270306 milliseconds - retrying
> 2010-11-19 12:54:11,415 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 16 on 60020, call next(6416258050001207387, 100) from 10.202.36.42:37477: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 6416258050001207387 on us01-ciqps1-grid05.carrieriq.com,60020,1290003836762
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6416258050001207387 on us01-ciqps1-grid05.carrieriq.com,60020,1290003836762
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1885)
>     at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
> ...
> {code}
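
For illustration only, here is a rough sketch of how the listTables() scan quoted in the issue could keep track of tables that have catalog rows but no row with a zero-length start key, which is the condition the issue describes for a broken table. It does not use the HBase API: CatalogRow, TableStatus, BrokenTableScan and listTablesWithStatus are hypothetical names made up for this example, and each .META. row is reduced to a table name plus a start key. The "broken" definition used here is an assumption drawn from the description, not an actual HBase rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in HBase. */
final class BrokenTableScan {

    /** Stand-in for one info:regioninfo row in .META. (table name + region start key). */
    static final class CatalogRow {
        final String tableName;
        final byte[] startKey;

        CatalogRow(String tableName, byte[] startKey) {
            this.tableName = tableName;
            this.startKey = startKey;
        }
    }

    /** A table name plus a flag saying whether it looks broken. */
    static final class TableStatus {
        final String tableName;
        final boolean broken;

        TableStatus(String tableName, boolean broken) {
            this.tableName = tableName;
            this.broken = broken;
        }
    }

    /**
     * Returns every table that has at least one catalog row. A table is
     * flagged as broken (assumption) when none of its rows has a
     * zero-length start key, i.e. its first region is missing.
     */
    static List<TableStatus> listTablesWithStatus(List<CatalogRow> metaRows) {
        // table name -> true once a row with a zero-length start key was seen
        Map<String, Boolean> hasFirstRegion = new LinkedHashMap<>();
        for (CatalogRow row : metaRows) {
            boolean isFirstRegion = row.startKey.length == 0;
            hasFirstRegion.merge(row.tableName, isFirstRegion, Boolean::logicalOr);
        }

        List<TableStatus> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : hasFirstRegion.entrySet()) {
            result.add(new TableStatus(entry.getKey(), !entry.getValue()));
        }
        return result;
    }
}
```

With a row set like the one in the issue (a single packageindex row whose STARTKEY is non-empty), this would return packageindex with broken set to true rather than skipping it, so a caller such as master.jsp could highlight it. The same set of flagged names could also serve as the list of dangling rows that the deleteTable() alternative would need to clean up.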