Subject: Re: Re : Re: Strange bug split a table in two
From: Michael Seibold
To: hbase-user@hadoop.apache.org
Date: Thu, 19 Feb 2009 07:45:36 +0100

Hi,

I have also experienced similar issues with HBase 0.19.0 when inserting
lots of data with the Java client (a minimal sketch of the insert
pattern is shown below):

1. I keep adding data to region X.
2. An automatic region split for X is started.
3. I keep adding data to region X.
4. A second automatic region split is started before the first split
   has finished.

The result:

1. Crash of the HBase region server.
2. Crash of the HBase master.
3. Crash of the client.
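For concreteness, here is the kind of sustained insert loop described
above, written against the 0.19-era client API (HTable and BatchUpdate).
This is a minimal sketch; the table name, column, and key scheme are
placeholders, not anything from this thread.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class InsertLoop {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        // Placeholder table and column family; any write-heavy schema will do.
        HTable table = new HTable(conf, "test_table");
        for (long i = 0; i < 100000000L; i++) {
            // Monotonically increasing keys keep every write landing on the
            // same "hot" region, so it splits again and again while inserts
            // continue to arrive -- the sequence described in steps 1-4 above.
            BatchUpdate update = new BatchUpdate("row-" + i);
            update.put("data:value", ("value-" + i).getBytes());
            table.commit(update);
        }
    }
}

The loop deliberately never pauses around splits; the crashes reported
here show up when writes keep arriving while a region is mid-split.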
Kind regards,
Michael

On Wed, 2009-02-18 at 10:50 -0800, stack wrote:
> If it doesn't work -- even during intense writing -- it's a bug.
> St.Ack
>
> On Wed, Feb 18, 2009 at 10:33 AM, wrote:
>
> > Is it possible to request a split or compaction during intensive
> > writes?
> >
> > ----- Original Message -----
> > From: stack
> > Date: Wednesday, February 18, 2009, 6:38 pm
> > Subject: Re: Strange bug split a table in two
> >
> > > Jérôme:
> > >
> > > Which version of HBase?
> > >
> > > Enable DEBUG. See the FAQ for how. Have you read the Getting
> > > Started guide, where it suggests you up the file descriptors?
> > > See also the end of the troubleshooting page for the Hadoop
> > > config needed for HBase.
> > >
> > > How big are your tables? How many rows/regions?
> > >
> > > St.Ack
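A note on the DEBUG suggestion above: the FAQ route is to edit
conf/log4j.properties on the master and region servers, but for quick
client-side experiments the same level can also be raised
programmatically through the log4j 1.2 API that Hadoop and HBase used at
the time. A minimal sketch:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class EnableHBaseDebug {
    public static void main(String[] args) {
        // Equivalent to adding
        //   log4j.logger.org.apache.hadoop.hbase=DEBUG
        // to conf/log4j.properties. This only affects the current JVM;
        // master and region server logs still need the properties file.
        Logger.getLogger("org.apache.hadoop.hbase").setLevel(Level.DEBUG);
    }
}

The file-descriptor limit mentioned alongside it is an operating-system
setting (ulimit -n), raised outside HBase entirely.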
> > >
> > > On Wed, Feb 18, 2009 at 7:57 AM, Jérôme Thièvre INA wrote:
> > >
> > > > Hi,
> > > >
> > > > During batch insertion of rows into a table with the Java client,
> > > > I requested a split of this table via the HBase web interface.
> > > > The insertion process started to slow down, which I think is
> > > > normal, but then it stopped with no exception.
> > > >
> > > > So I stopped the HBase cluster with bin/stop-hbase.sh, and every
> > > > region server stopped normally (I didn't kill any process).
> > > >
> > > > I took a look at the logs.
> > > >
> > > > The master log's first exceptions:
> > > >
> > > > 2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234542589092: metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234542589092 split; daughters: metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234968484302, metadata_table,r: http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from 10.1.188.16:60020
> > > > 2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r: http://net.series-tv.www/index.php?showtopic=6973,1234968484302 to server 10.1.188.16:60020
> > > > 2009-02-18 15:48:27,970 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234968484302 to server 10.1.188.16:60020
> > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: metadata_table,r: http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145 from 10.1.188.179:60020
> > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: metadata_table,r: http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 from 10.1.188.179:60020
> > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r: http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 open on 10.1.188.179:60020
> > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row metadata_table,r: http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 in region .META.,,1 with startcode 1234946982368 and server 10.1.188.179:60020
> > > > 2009-02-18 15:48:30,994 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234968484302 from 10.1.188.16:60020
> > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: metadata_table,r: http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from 10.1.188.16:60020
> > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r: http://net.series-tv.www/index.php?showtopic=6973,1234968484302 open on 10.1.188.16:60020
> > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row metadata_table,r: http://net.series-tv.www/index.php?showtopic=6973,1234968484302 in region .META.,,1 with startcode 1234946972127 and server 10.1.188.16:60020
> > > > 2009-02-18 15:48:40,006 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234968484302: java.io.IOException: Could not obtain block: blk_-6029004777792863005_53535 file=/hbase/metadata_table/1933533649/location/info/912096781946009771.309611126
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
> > > >   at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> > > >   at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> > > >   at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> > > >   at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
> > > >   at org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
> > > >   at org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
> > > >   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
> > > >   at java.lang.Thread.run(Thread.java:619)
> > > > from 10.1.188.16:60020
> > > > 2009-02-18 15:48:42,681 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r: http://net.series-tv.www/index.php?showforum=197,1234968484302 to server 10.1.188.149:60020
> > > > 2009-02-18 15:48:44,580 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: metadata_table,r: http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145: java.io.IOException: Could not obtain block: blk_1599510651183165167_53487 file=/hbase/metadata_table/1127743078/type/info/5407628626802748081.1381909621
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
> > > >   at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> > > >   at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> > > >   at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> > > >   at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
> > > >   at org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
> > > >   at org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
> > > >   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
> > > >   at java.lang.Thread.run(Thread.java:619)
> > > > from 10.1.188.179:60020
> > > >
> > > > And after a few exceptions on different regions:
> > > >
> > > > 2009-02-18 15:49:29,955 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {regionname: .META.,,1, startKey: <>, server: 10.1.188.16:60020}
> > > > java.io.IOException: java.io.IOException: HStoreScanner failed construction
> > > >   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:70)
> > > >   at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:88)
> > > >   at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2125)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:1989)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1180)
> > > >   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1700)
> > > >   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
> > > >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >   at java.lang.reflect.Method.invoke(Method.java:597)
> > > >   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > > >   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> > > > Caused by: java.io.IOException: Could not obtain block: blk_6746847995679537137_51100 file=/hbase/.META./1028785192/info/mapfiles/2067000542076825598/data
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > >   at java.io.DataInputStream.readFully(DataInputStream.java:178)
> > > >   at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > > >   at org.apache.hadoop.hbase.io.SequenceFile$Reader.init(SequenceFile.java:1464)
> > > >   at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
> > > >   at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
> > > >   at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1426)
> > > >   at org.apache.hadoop.hbase.io.MapFile$Reader.createDataFileReader(MapFile.java:310)
> > > >   at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.createDataFileReader(HBaseMapFile.java:96)
> > > >   at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:292)
> > > >   at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:79)
> > > >   at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:65)
> > > >   at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
> > > >   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
> > > >   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:67)
> > > >   ... 10 more
> > > >
> > > >   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > > >   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > > >   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > > >   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > > >   at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:95)
> > > >   at org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:185)
> > > >   at org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
> > > >   at org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
> > > >   at org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:137)
> > > >   at org.apache.hadoop.hbase.Chore.run(Chore.java:65)
> > > >
> > > > When I restarted the cluster, I had two instances of my table
> > > > (with the same name).
> > > >
> > > > I have just requested a major compaction, and everything seems
> > > > to be fine. Hadoop fsck doesn't find any problems.
> > > >
> > > > I have some questions:
> > > >
> > > > Could the .META. or -ROOT- tables have been corrupted? Do you
> > > > think some data may have been lost from the table?
> > > > Is it safe to split or compact a table during writes? I thought
> > > > it was OK.
> > > >
> > > > Jérôme Thièvre
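As a footnote to the duplicated table Jérôme reports after his restart:
one way to see what the master believes exists is to list the tables
through the client API. A minimal sketch, assuming the 0.19-era
HBaseAdmin; since listTables() is derived from .META., whether a damaged
catalog actually shows a table twice depends on how the master folds the
rows together, so treat the output as a hint rather than a verdict.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class ListTables {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HBaseAdmin admin = new HBaseAdmin(conf);
        // Print every table descriptor the master derives from .META.;
        // a name listed twice would point at duplicated catalog rows.
        for (HTableDescriptor d : admin.listTables()) {
            System.out.println(Bytes.toString(d.getName()));
        }
    }
}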