Subject: Re: EC2 + Thrift inserts
From: Ryan Rawson <ryanobjc@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Thu, 29 Apr 2010 21:14:41 -0700

One thing to check: at the peak of your load, run jstack on one of the
regionservers and look at the handler threads - if all of them are busy
doing something, you might be running into handler contention. Ultimately
it is IO bound.

-ryan
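(A quick way to do that check, sketched as a shell one-liner; the
"IPC Server handler" thread name comes from the 0.20-era RPC server and is
an assumption here, so adjust it to whatever jstack actually prints:)

    # dump the regionserver's threads at peak load and pull out the RPC handlers
    jstack <regionserver-pid> | grep -A 2 "IPC Server handler"

If most handlers show up as RUNNABLE inside put/flush calls rather than
parked waiting on the call queue, the handler pool is saturated.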
On Thu, Apr 29, 2010 at 9:12 PM, Chris Tarnas wrote:
> They are all at 100, but none of the regionservers are loaded - most are
> less than 20% CPU. Is this all network latency?
>
> -chris
>
> On Apr 29, 2010, at 8:29 PM, Ryan Rawson wrote:
>
>> Every insert on an indexed table would require at the very least an RPC
>> to a different regionserver. If the regionservers are busy, your request
>> could wait in the queue for a moment.
>>
>> One param to tune would be the handler thread count. Set it to 100 at
>> least.
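(The handler count being discussed here is the
hbase.regionserver.handler.count setting; a minimal sketch of the
hbase-site.xml change, assuming the regionservers are restarted afterwards -
the 0.20-era default was low, on the order of 10:)

    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>100</value>
    </property>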
>>
>> On Thu, Apr 29, 2010 at 2:16 AM, Chris Tarnas wrote:
>>>
>>> I just finished some testing with JDK 1.6 u17 - so far no performance
>>> improvements with just changing that. Disabling LZO compression did gain
>>> a little bit (up to about 30/sec from 25/sec per thread). Turning off
>>> indexes helped the most - that brought me up to 115/sec (2875 total rows
>>> a second). A single perl/thrift process can load at over 350 rows/sec,
>>> so it's not scaling as well as I would have expected, even without the
>>> indexes.
>>>
>>> Are the transactional indexes that costly? What is the bottleneck there?
>>> CPU utilization and network packets went up when I disabled the indexes,
>>> so I don't think those are the bottlenecks for the indexes. I was even
>>> able to add another 15 insert processes (total of 40) and only lost
>>> about 10% on per-process throughput. I probably could go even higher;
>>> none of the nodes are above 60% CPU utilization and IO wait was at most
>>> 3.5%.
>>>
>>> Each rowkey is unique, so there should not be any blocking on the row
>>> locks. I'll do more indexed tests tomorrow.
>>>
>>> thanks,
>>> -chris
>>>
>>> On Apr 29, 2010, at 12:18 AM, Todd Lipcon wrote:
>>>
>>>> Definitely smells like JDK 1.6.0_18. Downgrade that back to 16 or 17
>>>> and you should be good to go. _18 is a botched release if I ever saw
>>>> one.
>>>>
>>>> -Todd
>>>>
>>>> On Wed, Apr 28, 2010 at 10:54 PM, Chris Tarnas wrote:
>>>>
>>>>> Hi Stack,
>>>>>
>>>>> Thanks for looking. I checked the ganglia charts: no server was at
>>>>> more than ~20% CPU utilization at any time during the load test and
>>>>> swap was never used. Network traffic was light - just running a count
>>>>> through the hbase shell generates much higher use. On the server
>>>>> hosting meta specifically, it was at about 15-20% CPU, and IO wait
>>>>> never went above 3%, usually down near 0.
>>>>>
>>>>> The load also died with a thrift timeout on every single node (each
>>>>> node connecting to localhost for its thrift server). It looks like a
>>>>> datanode just died and caused every thrift connection to time out -
>>>>> I'll have to up that limit to handle a node death.
>>>>>
>>>>> Checking logs, this appears in the logs of the region server hosting
>>>>> meta; it looks like the dead datanode caused this error:
>>>>>
>>>>> 2010-04-29 01:01:38,948 WARN org.apache.hadoop.hdfs.DFSClient:
>>>>> DFSOutputStream ResponseProcessor exception for block
>>>>> blk_508630839844593817_11180
>>>>> java.io.IOException: Bad response 1 for block
>>>>> blk_508630839844593817_11180 from datanode 10.195.150.255:50010
>>>>>     at
>>>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2423)
>>>>>
>>>>> The regionserver log on the dead node, 10.195.150.255, has some more
>>>>> errors in it:
>>>>>
>>>>> http://pastebin.com/EFH9jz0w
>>>>>
>>>>> I found this in the .out file on the datanode:
>>>>>
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode
>>>>> linux-amd64 )
>>>>> # Problematic frame:
>>>>> # V  [libjvm.so+0x62263c]
>>>>> #
>>>>> # An error report file with more information is saved as:
>>>>> # /usr/local/hadoop-0.20.1/hs_err_pid1364.log
>>>>> #
>>>>> # If you would like to submit a bug report, please visit:
>>>>> #   http://java.sun.com/webapps/bugreport/crash.jsp
>>>>> #
>>>>>
>>>>> There is not a single error in the datanode's log though. Also of
>>>>> note - this happened well into the test, so the node dying caused the
>>>>> load to abort but not the prior poor performance. Looking through the
>>>>> mailing list it looks like java 1.6.0_18 has a bad rep, so I'll update
>>>>> the AMI (although I'm using the same JVM on other servers in the
>>>>> office w/o issue, with decent single node performance and no
>>>>> deaths...).
>>>>>
>>>>> Thanks for any help!
>>>>> -chris
>>>>>
>>>>> On Apr 28, 2010, at 10:10 PM, Stack wrote:
>>>>>
>>>>>> What is load on the server hosting meta like? Higher than others?
>>>>>>
>>>>>> On Apr 28, 2010, at 8:42 PM, Chris Tarnas wrote:
>>>>>>
>>>>>>> Hi JG,
>>>>>>>
>>>>>>> Speed is now down to 18 rows/sec/table per process.
>>>>>>>
>>>>>>> Here is a regionserver log that is serving two of the regions:
>>>>>>>
>>>>>>> http://pastebin.com/Hx5se0hz
>>>>>>>
>>>>>>> Here is the GC log from the same server:
>>>>>>>
>>>>>>> http://pastebin.com/ChrRvxCx
>>>>>>>
>>>>>>> Here is the master log:
>>>>>>>
>>>>>>> http://pastebin.com/L1Kn66qU
>>>>>>>
>>>>>>> The thrift server logs have nothing in them in the same time period.
>>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>> -chris
>>>>>>>
>>>>>>> On Apr 28, 2010, at 7:32 PM, Jonathan Gray wrote:
>>>>>>>
>>>>>>>> Hey Chris,
>>>>>>>>
>>>>>>>> That's a really significant slowdown. I can't think of anything
>>>>>>>> obvious that would cause that in your setup.
>>>>>>>>
>>>>>>>> Any chance of some regionserver and master logs from the time it
>>>>>>>> was going slow? Is there any activity in the logs of the
>>>>>>>> regionservers hosting the regions of the table being written to?
>>>>>>>>
>>>>>>>> JG
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Christopher Tarnas [mailto:cft@tarnas.org] On Behalf Of
>>>>>>>>> Chris Tarnas
>>>>>>>>> Sent: Wednesday, April 28, 2010 6:27 PM
>>>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>>>> Subject: EC2 + Thrift inserts
>>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> First, thanks to all the HBase developers for producing this, it's
>>>>>>>>> a great project and I'm glad to be able to use it.
>>>>>>>>>
>>>>>>>>> I'm looking for some help and hints here with insert performance.
>>>>>>>>> I'm doing some benchmarking, testing how I can scale up using
>>>>>>>>> HBase, not really looking at raw speed. The testing is happening
>>>>>>>>> on EC2, using Andrew's scripts (thanks - those were very helpful)
>>>>>>>>> to set them up, and with a slightly customized version of the
>>>>>>>>> default AMIs (added my application modules). I'm using HBase
>>>>>>>>> 0.20.3 and Hadoop 0.20.1. I've looked at the tips in the Wiki and
>>>>>>>>> it looks like Andrew's scripts are already set up that way.
>>>>>>>>>
>>>>>>>>> I'm inserting into HBase from a hadoop streaming job that runs
>>>>>>>>> perl and uses the thrift gateway. I'm also using the Transactional
>>>>>>>>> tables, so that alone could be the cause, but from what I can tell
>>>>>>>>> I don't think so. LZO compression is also enabled for the column
>>>>>>>>> families (much of the data is highly compressible). My cluster has
>>>>>>>>> 7 nodes: 5 regionservers, 1 master and 1 zookeeper. The
>>>>>>>>> regionservers and master are c1.xlarges. Each regionserver runs
>>>>>>>>> the tasktracker that runs the hadoop streaming jobs, and each
>>>>>>>>> regionserver also runs its own thrift server. Each mapper that
>>>>>>>>> does the load talks to the localhost's thrift server.
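(For illustration, a minimal sketch of that per-row insert path through the
Thrift gateway, written in Python rather than perl for brevity; the module
names and the mutateRow signature follow the thrift bindings generated from
the 0.20-era Hbase.thrift and may differ in other versions, and the table
and column names here are made up:)

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase                 # thrift-generated client
    from hbase.ttypes import Mutation

    # each mapper talks to the Thrift server on its own host
    transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()

    # one mutateRow per row means one round trip to the gateway per insert,
    # and the gateway in turn makes its own RPCs to the regionservers (plus
    # the index regions when transactional indexing is enabled)
    client.mutateRow('mytable', '321Anur',
                     [Mutation(column='fam:col', value='some data')])

    transport.close()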
>>>>>>>>>
>>>>>>>>> The row keys are a fixed string plus an incremental number, with
>>>>>>>>> the order of the bytes then reversed, so runA123 becomes 321Anur.
>>>>>>>>> I thought of using a murmur hash but was worried about collisions.
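(A tiny sketch of that key scheme, with a hypothetical prefix and counter;
reversing the string puts the fastest-changing digit first, so consecutive
keys scatter across the key space instead of piling onto a single region:)

    def make_row_key(prefix, n):
        # "runA", 123 -> "runA123" -> reversed -> "321Anur"
        return ("%s%d" % (prefix, n))[::-1]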
>>>>>>>>>
>>>>>>>>> As I add more insert jobs, each job's throughput goes down. Way
>>>>>>>>> down. I went from about 200 rows/sec/table per job with one job to
>>>>>>>>> about 24 rows/sec/table per job with 25 running jobs. The servers
>>>>>>>>> are mostly idle. I'm loading into two tables: one has several
>>>>>>>>> indexes and I'm loading into three column families, the other has
>>>>>>>>> no indexes and one column family. Both tables currently have only
>>>>>>>>> two regions each.
>>>>>>>>>
>>>>>>>>> The regionserver that serves the indexed table's regions is using
>>>>>>>>> the most CPU but is 87% idle. The other servers are all at ~90%
>>>>>>>>> idle. There is no IO wait. The perl processes are barely ticking
>>>>>>>>> over. Java on the most "loaded" server is using about 50-60% of
>>>>>>>>> one CPU.
>>>>>>>>>
>>>>>>>>> Normally when I do a load in a pseudo-distributed hbase (my
>>>>>>>>> development platform) perl's speed is the limiting factor and uses
>>>>>>>>> about 85% of a CPU. In this cluster they are using only 5-10% of a
>>>>>>>>> CPU as they are all waiting on thrift (hbase). When I run only 1
>>>>>>>>> process on the cluster, perl uses much more of a CPU, maybe 70%.
>>>>>>>>>
>>>>>>>>> Any tips or help in getting the speed/scalability up would be
>>>>>>>>> great. Please let me know if you need any other info.
>>>>>>>>>
>>>>>>>>> As I send this - it looks like the main table has split again and
>>>>>>>>> is being served by three regionservers. My performance is going up
>>>>>>>>> a bit (now 35 rows/sec/table per process), but it still seems like
>>>>>>>>> I'm not using the full potential of even the limited EC2 system -
>>>>>>>>> no IO wait and lots of idle CPU.
>>>>>>>>>
>>>>>>>>> many thanks
>>>>>>>>> -chris
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera