hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Re : Re: Strange bug split a table in two
Date Thu, 19 Feb 2009 07:30:16 GMT
A split while another is in progress?  That's a good one.  There are supposed
to be locks to protect against that happening.
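
Something like this is the idea (a minimal illustrative sketch, not the
actual HBase internals -- the class and method names here are made up):

import java.util.concurrent.locks.ReentrantLock;

public class SplitGuard {
    private final ReentrantLock splitLock = new ReentrantLock();

    /** Returns false if a split is already running for this region. */
    public boolean trySplit(Runnable doSplit) {
        if (!splitLock.tryLock()) {
            return false; // second split request is refused, not run concurrently
        }
        try {
            // close the parent, write daughter references, update .META.
            doSplit.run();
            return true;
        } finally {
            splitLock.unlock();
        }
    }
}

If two splits for the same region really did interleave, a guard like that
failed somewhere, which is why I'd like the details below.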

Which hbase version?  Can you reproduce?  What does your configuration look
like?

Thanks,
St.Ack


On Wed, Feb 18, 2009 at 10:45 PM, Michael Seibold <seibold@in.tum.de> wrote:

> Hi,
>
> I have also experienced similar issues with hbase 0.19.0 when inserting
> lots of data with the Java client.
>
> 1. I keep adding data to region X
> 2. An automatic region split for X is started
> 3. I keep adding data to region X
> 4. A second automatic region split is started before the first
> automatic region split is finished
> -> Result: 1. the HBase regionserver crashes, 2. the HBase master
> crashes, 3. the client crashes. A sketch of my insert loop is below.
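>
> A minimal sketch of the insert loop (0.19-era client API, as I understand
> it; the table and column names are placeholders, not my real schema):
>
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.io.BatchUpdate;
>
> public class BulkInsert {
>     public static void main(String[] args) throws Exception {
>         HTable table = new HTable(new HBaseConfiguration(), "mytable");
>         for (long i = 0; i < 100000000L; i++) {
>             BatchUpdate update = new BatchUpdate("row-" + i);
>             update.put("data:payload", new byte[1024]); // ~1 KB per row
>             table.commit(update); // sustained writes keep the region growing
>         }
>     }
> }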
>
> Kind regards,
> Michael
>
> On Wed, 2009-02-18 at 10:50 -0800, stack wrote:
> > If it doesn't work -- even during intense writing -- it's a bug.
> > St.Ack
> >
> >
> > On Wed, Feb 18, 2009 at 10:33 AM, <jthievre@ina.fr> wrote:
> >
> > > Is it possible to request a split or compaction during intensive writes?
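> > >
> > > (Something like the following is what I mean -- note that split() and
> > > majorCompact() are the HBaseAdmin method names in newer client APIs, so
> > > treat this as a sketch to check against your release; the table name is
> > > a placeholder:)
> > >
> > > import org.apache.hadoop.hbase.HBaseConfiguration;
> > > import org.apache.hadoop.hbase.client.HBaseAdmin;
> > >
> > > public class RequestSplitOrCompact {
> > >     public static void main(String[] args) throws Exception {
> > >         HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
> > >         admin.split("mytable");        // ask for a split of the table's regions
> > >         admin.majorCompact("mytable"); // or ask for a major compaction
> > >     }
> > > }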
> > >
> > >
> > >
> > > ----- Original message -----
> > > From: stack <stack@duboce.net>
> > > Date: Wednesday, February 18, 2009 6:38 pm
> > > Subject: Re: Strange bug split a table in two
> > >
> > > > Jérôme:
> > > >
> > > > Which version of hbase?
> > > >
> > > > Enable DEBUG.  See the FAQ for how.  Have you read the getting
> > > > started guide where it suggests you up the file descriptors?  See
> > > > also the end of the troubleshooting page for the hadoop config
> > > > needed for hbase.
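> > > >
> > > > (If you want to flip on DEBUG from code rather than by editing
> > > > log4j.properties, something like this works with the bundled log4j
> > > > 1.2 -- a sketch, not the FAQ's exact recipe:)
> > > >
> > > > import org.apache.log4j.Level;
> > > > import org.apache.log4j.Logger;
> > > >
> > > > public class EnableHBaseDebug {
> > > >     public static void main(String[] args) {
> > > >         // raise hbase client logging to DEBUG for this JVM
> > > >         Logger.getLogger("org.apache.hadoop.hbase").setLevel(Level.DEBUG);
> > > >     }
> > > > }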
> > > >
> > > > How big are your tables?  How many rows/regions?
> > > >
> > > > St.Ack
> > > >
> > > >
> > > > On Wed, Feb 18, 2009 at 7:57 AM, Jérôme Thièvre INA
> > > > <jthievre@ina.fr> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > > During batch insertion of rows into a table with the Java client, I
> > > > > requested a split of this table through the HBase web interface.
> > > > > The insertion process started to slow down, which I think is
> > > > > normal, but then it stopped with no exception.
> > > > >
> > > > > So I stopped the hbase cluster with bin/stop-hbase.sh and every
> > > > > region server stopped normally (I didn't kill any process).
> > > > >
> > > > > I took a look at the logs:
> > > > >
> > > > > Master log, first exceptions:
> > > > >
> > > > > 2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234542589092: metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234542589092 split; daughters: metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234968484302, metadata_table,r:http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from 10.1.188.16:60020
> > > > > 2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r:http://net.series-tv.www/index.php?showtopic=6973,1234968484302 to server 10.1.188.16:60020
> > > > > 2009-02-18 15:48:27,970 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234968484302 to server 10.1.188.16:60020
> > > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: metadata_table,r:http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145 from 10.1.188.179:60020
> > > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: metadata_table,r:http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 from 10.1.188.179:60020
> > > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r:http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 open on 10.1.188.179:60020
> > > > > 2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row metadata_table,r:http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 in region .META.,,1 with startcode 1234946982368 and server 10.1.188.179:60020
> > > > > 2009-02-18 15:48:30,994 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234968484302 from 10.1.188.16:60020
> > > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: metadata_table,r:http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from 10.1.188.16:60020
> > > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r:http://net.series-tv.www/index.php?showtopic=6973,1234968484302 open on 10.1.188.16:60020
> > > > > 2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row metadata_table,r:http://net.series-tv.www/index.php?showtopic=6973,1234968484302 in region .META.,,1 with startcode 1234946972127 and server 10.1.188.16:60020
> > > > > 2009-02-18 15:48:40,006 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234968484302: java.io.IOException: Could not obtain block: blk_-6029004777792863005_53535 file=/hbase/metadata_table/1933533649/location/info/912096781946009771.309611126
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
> > > > >     at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> > > > >     at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> > > > >     at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> > > > >     at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
> > > > >     at java.lang.Thread.run(Thread.java:619)
> > > > >  from 10.1.188.16:60020
> > > > > 2009-02-18 15:48:42,681 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region metadata_table,r:http://net.series-tv.www/index.php?showforum=197,1234968484302 to server 10.1.188.149:60020
> > > > > 2009-02-18 15:48:44,580 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: metadata_table,r:http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145: java.io.IOException: Could not obtain block: blk_1599510651183165167_53487 file=/hbase/metadata_table/1127743078/type/info/5407628626802748081.1381909621
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
> > > > >     at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> > > > >     at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> > > > >     at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> > > > >     at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
> > > > >     at java.lang.Thread.run(Thread.java:619)
> > > > >  from 10.1.188.179:60020
> > > > > And after a few exceptions on different regions:
> > > > >
> > > > > 2009-02-18 15:49:29,955 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {regionname: .META.,,1, startKey: <>, server: 10.1.188.16:60020}
> > > > > java.io.IOException: java.io.IOException: HStoreScanner failed construction
> > > > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:70)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:88)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2125)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:1989)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1180)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1700)
> > > > >     at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
> > > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > > >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > > > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> > > > > Caused by: java.io.IOException: Could not obtain block: blk_6746847995679537137_51100 file=/hbase/.META./1028785192/info/mapfiles/2067000542076825598/data
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
> > > > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
> > > > >     at java.io.DataInputStream.readFully(DataInputStream.java:178)
> > > > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > > > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.init(SequenceFile.java:1464)
> > > > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
> > > > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
> > > > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1426)
> > > > >     at org.apache.hadoop.hbase.io.MapFile$Reader.createDataFileReader(MapFile.java:310)
> > > > >     at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.createDataFileReader(HBaseMapFile.java:96)
> > > > >     at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:292)
> > > > >     at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:79)
> > > > >     at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:65)
> > > > >     at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
> > > > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
> > > > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:67)
> > > > >     ... 10 more
> > > > >
> > > > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > > > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > > > >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > > > >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > > > >     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:95)
> > > > >     at org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:185)
> > > > >     at org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
> > > > >     at org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
> > > > >     at org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:137)
> > > > >     at org.apache.hadoop.hbase.Chore.run(Chore.java:65)
> > > > >
> > > > > When I restarted the cluster I had two instances of my table (with
> > > > > the same name).
> > > > >
> > > > > I have just requested a major compaction, and everything seems to
> > > > > be fine. Hadoop fsck doesn't find any problems.
> > > > >
> > > > > I have some questions:
> > > > >
> > > > > Could the .META. or .ROOT. tables have been corrupted? Do you think
> > > > > some data has been lost from the table?
> > > > > Is it safe to split or compact a table during writes? I thought it
> > > > > was OK.
> > > > >
> > > > > Jérôme Thièvre
> > > > >
> > > >
> > >
>
>
