Subject: Re: EC2 + Thrift inserts
From: Ryan Rawson <ryanobjc@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Thu, 29 Apr 2010 21:14:41 -0700

One thing to check: at the peak of your load, run jstack on one of the
regionservers and look at the handler threads - if all of them are busy
doing something, you might be running into handler contention. Ultimately
it is IO bound.

-ryan
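(A quick way to do that check, sketched as a shell one-liner; the
"IPC Server handler" thread name comes from the 0.20-era RPC server and is
an assumption here, so adjust it to whatever jstack actually prints:)

    # dump the regionserver's threads at peak load and pull out the RPC handlers
    jstack <regionserver-pid> | grep -A 2 "IPC Server handler"

If most handlers show up as RUNNABLE inside put/flush calls rather than
parked waiting on the call queue, the handler pool is saturated.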
On Thu, Apr 29, 2010 at 9:12 PM, Chris Tarnas wrote:
> They are all at 100, but none of the regionservers are loaded - most are
> less than 20% CPU. Is this all network latency?
>
> -chris
>
> On Apr 29, 2010, at 8:29 PM, Ryan Rawson wrote:
>
>> Every insert on an indexed table would require at the very least an RPC
>> to a different regionserver. If the regionservers are busy, your request
>> could wait in the queue for a moment.
>>
>> One param to tune would be the handler thread count. Set it to 100 at
>> least.
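(The handler count being discussed here is the
hbase.regionserver.handler.count setting; a minimal sketch of the
hbase-site.xml change, assuming the regionservers are restarted afterwards -
the 0.20-era default was low, on the order of 10:)

    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>100</value>
    </property>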
>>
>> On Thu, Apr 29, 2010 at 2:16 AM, Chris Tarnas wrote:
>>>
>>> I just finished some testing with JDK 1.6 u17 - so far no performance
>>> improvements with just changing that. Disabling LZO compression did gain
>>> a little bit (up to about 30/sec from 25/sec per thread). Turning off
>>> indexes helped the most - that brought me up to 115/sec (2875 total rows
>>> a second). A single perl/thrift process can load at over 350 rows/sec,
>>> so it's not scaling as well as I would have expected, even without the
>>> indexes.
>>>
>>> Are the transactional indexes that costly? What is the bottleneck there?
>>> CPU utilization and network packets went up when I disabled the indexes,
>>> so I don't think those are the bottlenecks for the indexes. I was even
>>> able to add another 15 insert processes (total of 40) and only lost
>>> about 10% on per-process throughput. I probably could go even higher;
>>> none of the nodes are above 60% CPU utilization and IO wait was at most
>>> 3.5%.
>>>
>>> Each rowkey is unique, so there should not be any blocking on the row
>>> locks. I'll do more indexed tests tomorrow.
>>>
>>> thanks,
>>> -chris
>>>
>>> On Apr 29, 2010, at 12:18 AM, Todd Lipcon wrote:
>>>
>>>> Definitely smells like JDK 1.6.0_18. Downgrade that back to 16 or 17
>>>> and you should be good to go. _18 is a botched release if I ever saw
>>>> one.
>>>>
>>>> -Todd
>>>>
>>>> On Wed, Apr 28, 2010 at 10:54 PM, Chris Tarnas wrote:
>>>>
>>>>> Hi Stack,
>>>>>
>>>>> Thanks for looking. I checked the ganglia charts: no server was at
>>>>> more than ~20% CPU utilization at any time during the load test and
>>>>> swap was never used. Network traffic was light - just running a count
>>>>> through the hbase shell generates much higher use. On the server
>>>>> hosting meta specifically, it was at about 15-20% CPU, and IO wait
>>>>> never went above 3%, usually down near 0.
>>>>>
>>>>> The load also died with a thrift timeout on every single node (each
>>>>> node connecting to localhost for its thrift server). It looks like a
>>>>> datanode just died and caused every thrift connection to time out -
>>>>> I'll have to up that limit to handle a node death.
>>>>>
>>>>> Checking logs, this appears in the logs of the region server hosting
>>>>> meta; it looks like the dead datanode caused this error:
>>>>>
>>>>> 2010-04-29 01:01:38,948 WARN org.apache.hadoop.hdfs.DFSClient:
>>>>> DFSOutputStream ResponseProcessor exception for block
>>>>> blk_508630839844593817_11180
>>>>> java.io.IOException: Bad response 1 for block
>>>>> blk_508630839844593817_11180 from datanode 10.195.150.255:50010
>>>>>     at
>>>>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2423)
>>>>>
>>>>> The regionserver log on the dead node, 10.195.150.255, has some more
>>>>> errors in it:
>>>>>
>>>>> http://pastebin.com/EFH9jz0w
>>>>>
>>>>> I found this in the .out file on the datanode:
>>>>>
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode
>>>>> linux-amd64 )
>>>>> # Problematic frame:
>>>>> # V  [libjvm.so+0x62263c]
>>>>> #
>>>>> # An error report file with more information is saved as:
>>>>> # /usr/local/hadoop-0.20.1/hs_err_pid1364.log
>>>>> #
>>>>> # If you would like to submit a bug report, please visit:
>>>>> #   http://java.sun.com/webapps/bugreport/crash.jsp
>>>>> #
>>>>>
>>>>> There is not a single error in the datanode's log though. Also of
>>>>> note - this happened well into the test, so the node dying caused the
>>>>> load to abort but not the prior poor performance. Looking through the
>>>>> mailing list it looks like java 1.6.0_18 has a bad rep, so I'll update
>>>>> the AMI (although I'm using the same JVM on other servers in the
>>>>> office w/o issue, with decent single node performance and no
>>>>> deaths...).
>>>>>
>>>>> Thanks for any help!
>>>>> -chris
>>>>>
>>>>> On Apr 28, 2010, at 10:10 PM, Stack wrote:
>>>>>
>>>>>> What is load on the server hosting meta like? Higher than others?
>>>>>>
>>>>>> On Apr 28, 2010, at 8:42 PM, Chris Tarnas wrote:
>>>>>>
>>>>>>> Hi JG,
>>>>>>>
>>>>>>> Speed is now down to 18 rows/sec/table per process.
>>>>>>>
>>>>>>> Here is a regionserver log that is serving two of the regions:
>>>>>>>
>>>>>>> http://pastebin.com/Hx5se0hz
>>>>>>>
>>>>>>> Here is the GC log from the same server:
>>>>>>>
>>>>>>> http://pastebin.com/ChrRvxCx
>>>>>>>
>>>>>>> Here is the master log:
>>>>>>>
>>>>>>> http://pastebin.com/L1Kn66qU
>>>>>>>
>>>>>>> The thrift server logs have nothing in them in the same time period.
>>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>> -chris
>>>>>>>
>>>>>>> On Apr 28, 2010, at 7:32 PM, Jonathan Gray wrote:
>>>>>>>
>>>>>>>> Hey Chris,
>>>>>>>>
>>>>>>>> That's a really significant slowdown. I can't think of anything
>>>>>>>> obvious that would cause that in your setup.
>>>>>>>>
>>>>>>>> Any chance of some regionserver and master logs from the time it
>>>>>>>> was going slow? Is there any activity in the logs of the
>>>>>>>> regionservers hosting the regions of the table being written to?
>>>>>>>>
>>>>>>>> JG
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Christopher Tarnas [mailto:cft@tarnas.org] On Behalf Of
>>>>>>>>> Chris Tarnas
>>>>>>>>> Sent: Wednesday, April 28, 2010 6:27 PM
>>>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>>>> Subject: EC2 + Thrift inserts
>>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> First, thanks to all the HBase developers for producing this, it's
>>>>>>>>> a great project and I'm glad to be able to use it.
>>>>>>>>>
>>>>>>>>> I'm looking for some help and hints here with insert performance.
>>>>>>>>> I'm doing some benchmarking, testing how I can scale up using
>>>>>>>>> HBase, not really looking at raw speed. The testing is happening
>>>>>>>>> on EC2, using Andrew's scripts (thanks - those were very helpful)
>>>>>>>>> to set them up, and with a slightly customized version of the
>>>>>>>>> default AMIs (added my application modules). I'm using HBase
>>>>>>>>> 0.20.3 and Hadoop 0.20.1. I've looked at the tips in the Wiki and
>>>>>>>>> it looks like Andrew's scripts are already set up that way.
>>>>>>>>>
>>>>>>>>> I'm inserting into HBase from a hadoop streaming job that runs
>>>>>>>>> perl and uses the thrift gateway. I'm also using the Transactional
>>>>>>>>> tables, so that alone could be the cause, but from what I can tell
>>>>>>>>> I don't think so. LZO compression is also enabled for the column
>>>>>>>>> families (much of the data is highly compressible). My cluster has
>>>>>>>>> 7 nodes: 5 regionservers, 1 master and 1 zookeeper. The
>>>>>>>>> regionservers and master are c1.xlarges. Each regionserver runs
>>>>>>>>> the tasktracker that runs the hadoop streaming jobs, and each
>>>>>>>>> regionserver also runs its own thrift server. Each mapper that
>>>>>>>>> does the load talks to the localhost's thrift server.
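(For illustration, a minimal sketch of that per-row insert path through the
Thrift gateway, written in Python rather than perl for brevity; the module
names and the mutateRow signature follow the thrift bindings generated from
the 0.20-era Hbase.thrift and may differ in other versions, and the table
and column names here are made up:)

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase                 # thrift-generated client
    from hbase.ttypes import Mutation

    # each mapper talks to the Thrift server on its own host
    transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()

    # one mutateRow per row means one round trip to the gateway per insert,
    # and the gateway in turn makes its own RPCs to the regionservers (plus
    # the index regions when transactional indexing is enabled)
    client.mutateRow('mytable', '321Anur',
                     [Mutation(column='fam:col', value='some data')])

    transport.close()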
>>>>>>>>>
>>>>>>>>> The row keys are a fixed string plus an incremental number, with
>>>>>>>>> the order of the bytes then reversed, so runA123 becomes 321Anur.
>>>>>>>>> I thought of using a murmur hash but was worried about collisions.
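(A tiny sketch of that key scheme, with a hypothetical prefix and counter;
reversing the string puts the fastest-changing digit first, so consecutive
keys scatter across the key space instead of piling onto a single region:)

    def make_row_key(prefix, n):
        # "runA", 123 -> "runA123" -> reversed -> "321Anur"
        return ("%s%d" % (prefix, n))[::-1]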
>>>>>>>>>
>>>>>>>>> As I add more insert jobs, each job's throughput goes down. Way
>>>>>>>>> down. I went from about 200 rows/sec/table per job with one job to
>>>>>>>>> about 24 rows/sec/table per job with 25 running jobs. The servers
>>>>>>>>> are mostly idle. I'm loading into two tables: one has several
>>>>>>>>> indexes and I'm loading into three column families, the other has
>>>>>>>>> no indexes and one column family. Both tables currently have only
>>>>>>>>> two regions each.
>>>>>>>>>
>>>>>>>>> The regionserver that serves the indexed table's regions is using
>>>>>>>>> the most CPU but is 87% idle. The other servers are all at ~90%
>>>>>>>>> idle. There is no IO wait. The perl processes are barely ticking
>>>>>>>>> over. Java on the most "loaded" server is using about 50-60% of
>>>>>>>>> one CPU.
>>>>>>>>>
>>>>>>>>> Normally when I do a load in a pseudo-distributed hbase (my
>>>>>>>>> development platform) perl's speed is the limiting factor and uses
>>>>>>>>> about 85% of a CPU. In this cluster they are using only 5-10% of a
>>>>>>>>> CPU as they are all waiting on thrift (hbase). When I run only 1
>>>>>>>>> process on the cluster, perl uses much more of a CPU, maybe 70%.
>>>>>>>>>
>>>>>>>>> Any tips or help in getting the speed/scalability up would be
>>>>>>>>> great. Please let me know if you need any other info.
>>>>>>>>>
>>>>>>>>> As I send this - it looks like the main table has split again and
>>>>>>>>> is being served by three regionservers. My performance is going up
>>>>>>>>> a bit (now 35 rows/sec/table per process), but it still seems like
>>>>>>>>> I'm not using the full potential of even the limited EC2 system -
>>>>>>>>> no IO wait and lots of idle CPU.
>>>>>>>>>
>>>>>>>>> many thanks
>>>>>>>>> -chris
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera