From: Derek Pappas
To: hbase-user@hadoop.apache.org
Subject: Re: production usage of HBase
Date: Sat, 17 Jan 2009 15:24:32 -0800

On Jan 16, 2009, at 9:04 PM, stack wrote:

> Derek Pappas wrote:
>> We are writing HTML files extracted from ARC files (from Heritrix)
>> to HBase. One run wrote 3 million HTML pages to HBase before dying.
>> We have implemented the HBase configuration based on the page you
>> directed me to. What kind of issues are you seeing with the machines?
>> -dp
>
> Are you using the hbase-writer for Heritrix?

No. See attached program. It parses the ARC files and writes the HTML
records to HBase. 5 data nodes and 3 regions.
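For context, a single-threaded loader of that shape, written against the
0.18/0.19-era HTable/BatchUpdate client API, looks roughly like the sketch
below. The table name "pages", the column "content:html", and the
nextHtmlRecord() helper are placeholders, not the attached program's
actual code:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;

    public class ArcToHBase {
      public static void main(String[] args) throws IOException {
        // Picks up hbase-site.xml from the classpath.
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable table = new HTable(conf, "pages"); // placeholder table name

        // nextHtmlRecord() stands in for the Heritrix ARC-record
        // iteration; assume it yields {url, html} pairs pulled out of
        // the ARC files.
        String[] rec;
        while ((rec = nextHtmlRecord()) != null) {
          BatchUpdate update = new BatchUpdate(rec[0]);         // row key = page URL
          update.put("content:html", rec[1].getBytes("UTF-8")); // family:qualifier
          table.commit(update);                                 // one RPC per page
        }
      }

      // Placeholder for the real ARC parser (e.g. Heritrix's ARC reader).
      private static String[] nextHtmlRecord() { return null; }
    }

Committing row by row like this sends a steady stream of small RPCs to
the region servers, which is part of why a sustained bulk load surfaces
HDFS limits such as the xceiver cap discussed further down.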
> You should use hadoop 0.19.0 and hbase 0.19.0 if you can (an RC was
> put up today). Much improved over 0.18.x (efficiencies and
> performance).
>
> What's the client that is pulling apart the ARCs like? Multithreaded
> single client or an MR job?

Single threaded.

> Tell us what you are seeing in your logs so we can help. Make sure
> you have DEBUG enabled (see earlier in the FAQ that J-D pointed you
> at for how).
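The DEBUG recipe comes down to one line in HBase's log4j configuration,
applied on each node and followed by a restart of the daemons:

    # $HBASE_HOME/conf/log4j.properties
    # Switch all HBase loggers to DEBUG so the region server logs carry
    # the detail needed for troubleshooting.
    log4j.logger.org.apache.hadoop.hbase=DEBUG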
> Errors posted below, datanodes complaining of blocks, as J-D
> indicates, should be addressed mostly by the troubleshooting section
> he pointed you to. You might also check datanode logs for errors.
> Could help give us a clue why the failures.
>
> Meantime, how many regions when it fails? Tell us about your schema
> and your hardware.

Dell 850s, plus Super Micro Core Duos and one quad-core. 5 data nodes,
3 regions.

> Thanks,
> St.Ack
>
>> On Jan 16, 2009, at 8:23 PM, Jean-Daniel Cryans wrote:
>>
>>> We usually see those kinds of HDFS errors when it's overloaded
>>> with requests from HBase. Those parameters should be enough...
>>> unless you didn't do this:
>>> http://wiki.apache.org/hadoop/Hbase/FAQ#6
>>>
>>> A script that checks the config? What do you mean?
>>>
>>> J-D
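For reference, the usual fix behind J-D's wiki pointers in this era was
raising two limits on every datanode; the values below are the commonly
circulated ones, not necessarily a quote from the wiki pages:

    <!-- conf/hadoop-site.xml on each datanode: raise the DataXceiver
         thread cap (the property name really is spelled "xcievers"
         in this Hadoop generation). -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>2047</value>
    </property>

    # Plus a higher open-file limit for the user running HDFS/HBase,
    # e.g. in /etc/security/limits.conf:
    hadoop  -  nofile  32768

Both changes take effect only after the datanodes are restarted (and,
for the file-descriptor limit, after the user logs in again).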
>>> On Fri, Jan 16, 2009 at 11:19 PM, Derek Pappas wrote:
>>>
>>>> J-D,
>>>>
>>>> Thanks for the reply. Will this solve most of the issues that we
>>>> listed in the email below, or do we need to tune other params as
>>>> well?
>>>>
>>>> Is there a script which checks configs?
>>>>
>>>> Thanks,
>>>>
>>>> -dp
>>>>
>>>> On Jan 16, 2009, at 8:01 PM, Jean-Daniel Cryans wrote:
>>>>
>>>>> Derek,
>>>>>
>>>>> We use hbase in semi-production mode; we've got [...] but mainly
>>>>> from the machines themselves. Have you tried the following?
>>>>> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#6
>>>>>
>>>>> J-D
>>>>>
>>>>> On Fri, Jan 16, 2009 at 9:01 PM, Derek Pappas wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Are there any companies using HBase in a production system that
>>>>>> can talk about HBase stability issues? We are a three-person
>>>>>> start-up and need to choose the right storage system the first
>>>>>> time. We are testing HBase 0.18 on a 7-machine cluster. We have
>>>>>> seen all sorts of errors, such as the following:
>>>>>>
>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
>>>>>>
>>>>>> [zzz@xxx ~]$ tail -f hbase-0.18.1/logs/hbase-xxx-regionserver-xxxx0.log
>>>>>>   at java.lang.reflect.Method.invoke(Unknown Source)
>>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>>>>   at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
>>>>>>
>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
>>>>>>
>>>>>> On an error like this, one of the servers (and the data inserts)
>>>>>> just hangs. Then you wait an hour or so to figure out whether it
>>>>>> comes out of it. The other servers don't recognize that the one
>>>>>> is gone.
>>>>>>
>>>>>> 2009-01-16 16:31:46,507 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/yotest1/689876272/size/mapfiles/8253971210487871616/index retries left 1
>>>>>> 2009-01-16 16:31:49,710 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /hbase/yotest1/689876272/size/mapfiles/8253971210487871616/index File does not exist. Holder DFSClient_464109999 does not have any open files.
>>>>>>   at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1169)
>>>>>>   at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1100)
>>>>>>   at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>>>>>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>>>>>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>>>>>>
>>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:715)
>>>>>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>   at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>>>>>>
>>>>>> 2009-01-16 08:26:12,017 WARN org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.7.0.104:50010, storageID=DS-603767860-10.7.0.104-50010-1230215140509, infoPort=50075, ipcPort=50020):Failed to transfer blk_-8100972070675150101_1897857 to 10.7.0.100:50010 got java.net.SocketException: Connection reset
>>>>>>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>>>>>>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>>>>>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>>>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunks(DataNode.java:1923)
>>>>>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:2011)
>>>>>>   at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:2899)
>>>>>>   at java.lang.Thread.run(Thread.java:595)
>>>>>>
>>>>>> 2009-01-16 08:39:18,952 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.7.0.101:50010, storageID=DS-1644697266-10.7.0.101-50010-1230180097338, infoPort=50075, ipcPort=50020):DataXceiver: java.net.SocketTimeoutException: Read timed out
>>>>>>   at java.net.SocketInputStream.socketRead0(Native Method)
>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:182)
>>>>>>   at java.io.DataInputStream.readByte(DataInputStream.java:248)
>>>>>>   at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:324)
>>>>>>   at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:345)
>>>>>>   at org.apache.hadoop.io.Text.readString(Text.java:410)
>>>>>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1270)
>>>>>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1076)
>>>>>>   at java.lang.Thread.run(Thread.java:619)
>>>>>>
>>>>>> 2009-01-16 08:44:20,551 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.7.0.106:44905 remote=/10.7.0.106:50010]
>>>>>>   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>>>>>>   at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>>>>>>   at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>>>>>>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1938)
>>>>>> [...]ading from blk_6762060810858066967_1788520 of /hbase/yotest1/1831862944/resp/mapfiles/6379496651348145490/data from 10.7.0.104:50010: java.io.IOException: Premeture EOF from inputStream
>>>>>>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:996)
>>>>>>   at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)
>>>>>>   at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:191)
>>>>>>   at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$BlockReader.read(DFSClient.java:858)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1384)
>>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1420)
>>>>>>   at java.io.DataInputStream.readFully(DataInputStream.java:176)
>>>>>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
>>>>>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1933)
>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1833)
>>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>>>>>   at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:516)
>>>>>>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1003)
>>>>>>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:893)
>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:902)
>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:860)
>>>>>>   at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:83)

Best Regards,

Derek Pappas
depappas at yahoo d0t com