hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: production usage of HBase
Date Sun, 18 Jan 2009 00:37:05 GMT

Another reason to move to 0.19.0 is that it has fixes for issues Andrew 
Purtell raised while crawling straight into hbase using hbase-writer.  In 
particular, both the RPC and the MapFiles would retain the shape of the 
largest byte buffer they had ever carried.  In other words, if you passed 
hbase a 100MB page, then thereafter the buffer used in RPC and in MapFile 
copying of the content would keep the 100MB size and never snap back.  
Pass a few 100MB pages over the RPC handlers and a good chunk of your 
heap is rendered effectively dead (see here for more on this: 
https://issues.apache.org/jira/browse/HBASE-900?focusedCommentId=12654009#action_12654009).
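As a rough illustration of the failure mode (a hypothetical sketch, not 
HBase or Hadoop code): a grow-only reuse buffer like the one below keeps 
the capacity of the largest value it has ever carried.

    // Hypothetical sketch of the grow-only buffer pathology described above;
    // not code from HBase or Hadoop.
    public class GrowOnlyBuffer {
        private byte[] buf = new byte[0];

        // Grows to fit the largest value ever carried, but never shrinks.
        public byte[] ensureCapacity(int size) {
            if (buf.length < size) {
                buf = new byte[size];
            }
            return buf;
        }

        public static void main(String[] args) {
            GrowOnlyBuffer b = new GrowOnlyBuffer();
            b.ensureCapacity(100 * 1024 * 1024); // one 100MB page passes through
            b.ensureCapacity(4 * 1024);          // every later value is small
            // 100MB stays allocated: that heap is effectively dead to other users.
            System.out.println("retained capacity: " + b.ensureCapacity(0).length);
        }
    }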

St.Ack



Derek Pappas wrote:
>
>
>
>
> On Jan 16, 2009, at 9:04 PM, stack wrote:
>
>> Derek Pappas wrote:
>>> We are writing HTML files extracted from ARC files (from Heritrix) 
>>> to hbase.
>>> One run wrote 3 million HTML pages to Hbase before dying.
>>> We have implemented the hbase configuration based on the page you 
>>> directed me to.
>>> What kind of issues are you seeing with the machines?
>>> -dp
>>
>> Are you using the hbase-writer for Heritrix?
>
> No. See attached program. It parses the ARC files and writes the HTML
> records to hbase.
> 5 data nodes and 3 regions.
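The attached program itself does not survive in the archive. A rough 
sketch of that write path, assuming the 0.19-era BatchUpdate client API 
(the table and column names below are made up, and exact constructor 
signatures varied between releases), might look like this:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;

    public class ArcToHBaseSketch {
        public static void main(String[] args) throws Exception {
            HBaseConfiguration conf = new HBaseConfiguration();
            // "pages" and "content:html" are made-up names for illustration.
            HTable table = new HTable(conf, "pages");
            // In the real program, url and html would come from parsing an ARC record.
            String url = "http://example.com/";
            byte[] html = "<html>...</html>".getBytes("UTF-8");
            BatchUpdate update = new BatchUpdate(url);  // row key
            update.put("content:html", html);           // column family:qualifier
            table.commit(update);                       // single-threaded writes
        }
    }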
>
>>
>>
>> You should use hadoop 0.19.0 and hbase 0.19.0 if you can (an RC was 
>> put up today).  Much improved over 0.18.x (efficiencies and performance).
>
>>
>>
>> What's the client that is pulling apart the ARCs like?  A multithreaded 
>> single client or an MR job?
>
> Single threaded.
>
>>
>>
>> Tell us what you are seeing in your logs so we can help.  Make sure 
>> you have DEBUG enabled (see earlier in the FAQ that J-D pointed you 
>> at for how).
>>
>> The errors posted below, datanodes complaining about blocks, should, 
>> as J-D indicates, mostly be addressed by the troubleshooting section 
>> he pointed you to.  You might also check the datanode logs for errors; 
>> that could give us a clue as to why the failures happen.
>>
>> Meantime, how many regions when it fails?  Tell us about your schema 
>> and your hardware.
>
> Dell 850s. Super Micro Core Duos and a quad core.
>
> 5 data nodes, 3 regions
>
>
>
>>
>>
>> Thanks,
>> St.Ack
>>
>>
>>>
>>>
>>> On Jan 16, 2009, at 8:23 PM, Jean-Daniel Cryans wrote:
>>>
>>>> We usually see those kinds of HDFS errors when it's overloaded with 
>>>> requests
>>>> from HBase. Those parameters should be enough... unless you didn't 
>>>> do this:
>>>> http://wiki.apache.org/hadoop/Hbase/FAQ#6
>>>>
>>>> A script that checks the config? What do you mean?
>>>>
>>>> J-D
>>>>
>>>> On Fri, Jan 16, 2009 at 11:19 PM, Derek Pappas <depappas@yahoo.com>
>>>> wrote:
>>>>
>>>>> J-D,
>>>>>
>>>>> Thanks for the reply. Will this solve most of the issues that we 
>>>>> listed in
>>>>> the email below
>>>>> or do we need to tune other params as well?
>>>>>
>>>>> Is there a script which checks configs?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -dp
>>>>>
>>>>>
>>>>> On Jan 16, 2009, at 8:01 PM, Jean-Daniel Cryans wrote:
>>>>>
>>>>>> Derek,
>>>>>>
>>>>>> We use hbase in semi-production mode, we've got
>>>>>> but mainly from the
>>>>>> machines themselves. Have you tried the following?
>>>>>> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#6
>>>>>>
>>>>>> J-D
>>>>>>
>>>>>> On Fri, Jan 16, 2009 at 9:01 PM, Derek Pappas 
>>>>>> <depappas@yahoo.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>> Are any companies using hbase in a production system that can
>>>>>>> talk about hbase stability issues?
>>>>>>> We are a three-person start-up and need to choose the right
>>>>>>> storage system the first time.
>>>>>>> We are testing hbase 0.18 on a 7-machine cluster. We have seen all
>>>>>>> sorts of errors, such as the following:
>>>>>>>
>>>>>>>
>>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient: Error
>>>>>>> Recovery for block null bad datanode[0]
>>>>>>> [zzz@xxx~]$ tail -f 
>>>>>>> hbase-0.18.1/logs/hbase-xxx-regionserver-xxxx0.log
>>>>>>>   at java.lang.reflect.Method.invoke(Unknown Source)
>>>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>>>>>   at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
>>>>>>>
>>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient: Error
>>>>>>> Recovery for block null bad datanode[0]
>>>>>>> On an error like this, one of the servers (and the data inserts)
>>>>>>> just hangs.
>>>>>>>
>>>>>>> Then you wait an hour or so to figure out whether it comes out of it.
>>>>>>>
>>>>>>> The other servers don't recognize that the one is gone.
>>>>>>>
>>>>>>> 2009-01-16 16:31:46,507 WARN org.apache.hadoop.dfs.DFSClient:
>>>>>>> NotReplicatedYetException sleeping
>>>>>>> /hbase/yotest1/689876272/size/mapfiles/8253971210487871616/index
>>>>>>> retries left 1
>>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient:
>>>>>>> DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
>>>>>>> org.apache.hadoop.dfs.LeaseExpiredException: No lease on
>>>>>>> /hbase/yotest1/689876272/size/mapfiles/8253971210487871616/index File
>>>>>>> does not exist. Holder DFSClient_464109999 does not have any open
>>>>>>> files.
>>>>>>>   at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1169)
>>>>>>>   at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1100)
>>>>>>>   at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>>>>>>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>>>>>>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>>>>>>>
>>>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:715)
>>>>>>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>   at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>>>>>>>
>>>>>>> 2009-01-16 08:26:12,017 WARN org.apache.hadoop.dfs.DataNode:
>>>>>>> DatanodeRegistration(10.7.0.104:50010,
>>>>>>> storageID=DS-603767860-10.7.0.104-50010-1230215140509, 
>>>>>>> infoPort=50075,
>>>>>>> ipcPort=50020):Failed to transfer 
>>>>>>> blk_-8100972070675150101_1897857 to
>>>>>>> 10.7.0.100:50010 got java.net.SocketException: Connection reset
>>>>>>>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>>>>>>>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>>>>>>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>>>>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunks(DataNode.java:1923)
>>>>>>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:2011)
>>>>>>>   at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:2899)
>>>>>>>   at java.lang.Thread.run(Thread.java:595)
>>>>>>>
>>>>>>> 2009-01-16 08:39:18,952 ERROR org.apache.hadoop.dfs.DataNode:
>>>>>>> DatanodeRegistration(10.7.0.101:50010,
>>>>>>> storageID=DS-1644697266-10.7.0.101-50010-1230180097338, 
>>>>>>> infoPort=50075,
>>>>>>> ipcPort=50020):DataXceiver: java.net.SocketTimeoutException: Read
>>>>>>> timed out
>>>>>>>   at java.net.SocketInputStream.socketRead0(Native Method)
>>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:182)
>>>>>>>   at java.io.DataInputStream.readByte(DataInputStream.java:248)
>>>>>>>   at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:324)
>>>>>>>   at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:345)
>>>>>>>   at org.apache.hadoop.io.Text.readString(Text.java:410)
>>>>>>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1270)
>>>>>>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1076)
>>>>>>>   at java.lang.Thread.run(Thread.java:619)
>>>>>>>
>>>>>>> 2009-01-16 08:44:20,551 WARN org.apache.hadoop.dfs.DFSClient:
>>>>>>> DataStreamer Exception: java.net.SocketTimeoutException: 15000 millis
>>>>>>> timeout while waiting for channel to be ready for write. ch :
>>>>>>> java.nio.channels.SocketChannel[connected local=/10.7.0.106:44905
>>>>>>> remote=/10.7.0.106:50010]
>>>>>>>   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>>>>>>>   at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>>>>>>>   at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>>>>>>>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1938)
>>>>>>>
>>>>>>> ading from blk_6762060810858066967_1788520 of
>>>>>>> /hbase/yotest1/1831862944/resp/mapfiles/6379496651348145490/data from
>>>>>>> 10.7.0.104:50010: java.io.IOException: Premeture EOF from inputStream
>>>>>>>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:996)
>>>>>>>   at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)
>>>>>>>   at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:191)
>>>>>>>   at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$BlockReader.read(DFSClient.java:858)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1384)
>>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1420)
>>>>>>>   at java.io.DataInputStream.readFully(DataInputStream.java:176)
>>>>>>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
>>>>>>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
>>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1933)
>>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1833)
>>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>>>>>>   at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:516)
>>>>>>>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1003)
>>>>>>>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:893)
>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:902)
>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:860)
>>>>>>>   at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:83)
>>>>>>>
>>>>>>> Best Regards,
>>>>>>>
>>>>>>> Derek Pappas
>>>>>>> depappas at yahoo d0t com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Derek Pappas
>>>>> depappas at yahoo d0t com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>> Best Regards,
>>>
>>> Derek Pappas
>>> depappas at yahoo d0t com
>>>
>>>
>>>
>>>
>>>
>>
>
> Best Regards,
>
> Derek Pappas
> depappas at yahoo d0t com
>
>
>
>

