hbase-user mailing list archives

From: Seraph Imalia <ser...@eisp.co.za>
Subject: Re: Hbase pausing problems
Date: Wed, 20 Jan 2010 17:37:55 GMT



> From: stack <stack@duboce.net>
> Reply-To: <hbase-user@hadoop.apache.org>
> Date: Wed, 20 Jan 2010 07:26:58 -0800
> To: <hbase-user@hadoop.apache.org>
> Subject: Re: Hbase pausing problems
> 
> On Wed, Jan 20, 2010 at 1:06 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
> 
>> 
>> The client stops being able to write to hBase as soon as 1 of the
>> regionservers starts doing this...
>> 
>> 2010-01-17 01:16:25,729 INFO
>> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of
>> ChannelDelivery,5352f559-d68e-42e9-be92-8bae82185ed1,1262544772804 because
>> global memstore limit of 396.7m exceeded; currently 396.7m and flushing
>> till 247.9m
>> 
> See hbase.regionserver.global.memstore.upperLimit and
> hbase.regionserver.global.memstore.lowerLimit.  The former is a prophylactic
> against OOME'ing.  The sum of all memory used by MemStores is not allowed to
> grow beyond 0.4 of total heap size (0.4 is the default).  The 247.9m figure
> in the above is 0.25 of the heap by default.  Writes are held up until
> sufficient MemStore space has been dumped by flushing.  You seem to be
> taking on writes at a rate in excess of the rate at which you can flush.
> We'll take a look at your logs...  You might up the 0.25 to 0.3 or 0.32.
> This will shorten the periods during which we stop taking on writes, but at
> the cost of increasing how often writes are disallowed.
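
For reference: both figures above are consistent with a heap just under 1G
(0.4 x ~992m = 396.7m and 0.25 x ~992m = 247.9m), so raising the lower limit
to 0.30 would stop each forced flush at roughly 298m instead of 248m.  A
hedged sketch of the corresponding hbase-site.xml override (property names as
they existed in the 0.20 line; the values are just the ones suggested above,
not tested recommendations):

  <!-- hbase-site.xml: the flush watermarks discussed above -->
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.4</value>   <!-- block writes past 0.4 of heap (the default) -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.30</value>  <!-- flush down to 0.30 instead of the 0.25 default -->
  </property>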

Does this mean that when 1 regionserver does a memstore flush, the other two
regionservers are also unavailable for writes?  I have watched the logs
carefully to make sure that not all the regionservers are flushing at the
same time.  Most of the time, only 1 server flushes at a time and in rare
cases, I have seen two at a time.

> 
> It also looks like you have little RAM space given over to hbase, just 1G?
> If your traffic is bursty, giving hbase more RAM might help it get over
> these write humps.

I have it at 1G on purpose.  When we first had the problem, I immediately
thought the problem was resource related, so I increased the hBase RAM to 3G
(each server has 8G - I was careful to watch for swapping).  This made the
problem worse because each memstore flush took longer, which stopped writing
for longer, and people started noticing that our system was down during
those periods.  Granted, the period between flushes was longer, but the net
effect was that the downtime became more noticeable.  So I have put the RAM
back down to 1G to minimize the negative effects on the live system, and
fewer people notice it.


> 
> 
> 
>> Or this...
>> 
>> 2010-01-17 01:16:26,159 INFO
>> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of
>> AdDelivery,613a401d-fb8a-42a9-aac6-d957f6281035,1261867806692 because
>> global memstore limit of 396.7m exceeded; currently 390.4m and flushing
>> till 247.9m
>> 
> This is a by-product of the above hitting 'global limit'.
> 
> 
> 
>> And then as soon as it finishes that, it starts doing this...
>> 
>> 2010-01-17 01:16:36,709 DEBUG
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
>> requested for region
>> AdDelivery,fb98f6c9-db13-4853-92ee-ffe1182fffd0,1263544763046/350999600
>> because: regionserver/192.168.2.88:60020.cacheFlusher
>> 
> These are 'normal'.  We are logging the fact that a compaction has been
> requested on a region.  This does not get in the way of our taking on writes
> (not directly).
> 
> 
> 
>> And as soon as it has finished the last of the Compaction Requests, the
>> client recovers and the regionserver starts doing this...
>> 
>> 2010-01-17 01:16:36,713 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Compaction size of ChannelDelivery_Family: 209.5m; Skipped 1 file(s), size:
>> 216906650
>> 2010-01-17 01:16:36,713 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Started compaction of 3 file(s)  into
>> /hbase/ChannelDelivery/compaction.dir/165262792, seqid=1241653592
>> 2010-01-17 01:16:37,143 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Completed compaction of ChannelDelivery_Family; new storefile is
>> hdfs://dynobuntu6:8020/hbase/ChannelDelivery/165262792/ChannelDelivery_Family/1673693545539520912;
>> store size is 209.5m
>> 
> 
> Above is 'normal'.  At DEBUG you see detail on hbase going about its
> business.
> 
> 
>> 
>> All of these logs seem perfectly acceptable to me - the problem is that it
>> just requires one of the regionservers to start doing this for the client
>> to be prevented from inserting new rows into hBase.  The logs don't seem to
>> explain why this is happening.
>> 
>> 
> Clients will be blocked writing regions carried by the affected regionserver
> only.  Your HW is not appropriate to the load as currently set up.  You might
> also consider adding more machines to your cluster.
>

Hmm... How does hBase decide which region to write to?  Is it possible that
hBase is deciding to write all our current records to one specific region
that happens to be on the server that is busy doing a memstore flush?
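
For background, a minimal sketch of how routing works, assuming the 0.20-era
Java client API: each Put is routed by its row key to the single region whose
key range contains that key, so how writes spread across servers depends
entirely on the keys.  The table and family names below are borrowed from the
logs above; the qualifier and value are made up.

  import java.util.UUID;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class RoutingSketch {
    public static void main(String[] args) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();
      HTable table = new HTable(conf, "ChannelDelivery");
      // A random (UUID) row key lands in an effectively random region, so
      // writes scatter across whichever servers host those regions; strictly
      // sequential keys would all land in one region on one server.
      Put put = new Put(Bytes.toBytes(UUID.randomUUID().toString()));
      put.add(Bytes.toBytes("ChannelDelivery_Family"), Bytes.toBytes("q"),
          Bytes.toBytes("some value"));
      table.put(put);  // routed to the regionserver hosting that key's region
    }
  }

With UUID keys like the ones visible in the region names above, writes should
scatter; a burst of closely clustered keys would concentrate on one region
and queue behind that server's flush.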

We are currently inserting about 6 million rows per day.  SQL Server (which
I am so happy to no longer be using for this) was able to write (and
replicate to a slave) 9 million records a day using a server of the same
spec.  I would like to see hBase cope with inserting 6 million on the 3
servers we have given it, at least.  Do you think this is possible, or is
our only answer to throw on more servers?
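
(For scale, assuming the 8AM-10PM window mentioned later in the quoted
thread: 6,000,000 rows / 86,400 s is about 69 inserts/second averaged over a
full day, or about 119/second averaged over the 14-hour window; bursts within
that window will of course run higher.)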

Seraph
 
> St.Ack
> 
> 
> 
>> Thank you for your assistance thus far; please let me know if you need or
>> discover anything else?
>> 
>> Regards,
>> Seraph
>> 
>> 
>> 
>>> From: Jean-Daniel Cryans <jdcryans@apache.org>
>>> Reply-To: <hbase-user@hadoop.apache.org>
>>> Date: Mon, 18 Jan 2010 09:49:16 -0800
>>> To: <hbase-user@hadoop.apache.org>
>>> Subject: Re: Hbase pausing problems
>>> 
>>> The next step would be to take a look at your region server's logs
>>> around the time of the insert pauses and of the clients not resuming
>>> after the loss of a region server. If you are able to gzip them and put
>>> them on a public server, it would be awesome.
>>> 
>>> Thx,
>>> 
>>> J-D
>>> 
>>> On Mon, Jan 18, 2010 at 1:03 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
>>>> Answers below...
>>>> 
>>>> Regards,
>>>> Seraph
>>>> 
>>>>> From: stack <stack@duboce.net>
>>>>> Reply-To: <hbase-user@hadoop.apache.org>
>>>>> Date: Fri, 15 Jan 2010 10:10:39 -0800
>>>>> To: <hbase-user@hadoop.apache.org>
>>>>> Subject: Re: Hbase pausing problems
>>>>> 
>>>>> How many CPUs?
>>>> 
>>>> 1x Quad Xeon in each server
>>>> 
>>>>> 
>>>>> You are using default JVM settings (see HBASE_OPTS in hbase-env.sh).  You
>>>>> might want to enable GC logging; see the line after HBASE_OPTS in
>>>>> hbase-env.sh and enable it.  GC logging might tell you about the pauses
>>>>> you are seeing.
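
A sketch of the kind of GC-logging line meant here, for hbase-env.sh.  These
are standard JDK 6 HotSpot flags, but the exact commented-out line shipped in
any given release may differ, and the log path below is a placeholder:

  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
      -XX:+PrintGCTimeStamps -Xloggc:/path/to/logs/gc-hbase.log"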
>>>> 
>>>> I will enable GC Logging tonight during our slow time because restarting
>>>> the regionservers causes the clients to pause indefinitely.
>>>> 
>>>>> 
>>>>> Can you get a fourth server for your cluster and run the master, zk, and
>>>>> namenode on it, and leave the other three servers for regionserver and
>>>>> datanode (with perhaps replication == 2, as per J-D, to lighten load on a
>>>>> small cluster)?
>>>> 
>>>> We plan to double the number of servers in the next few weeks and I will
>>>> take your advice to put the master, zk and namenode on one of them (we
>>>> will need to have a second one on standby should this one crash).  The
>>>> servers will be ordered shortly and will be here in a week or two.
>>>> 
>>>> That said, I have been monitoring CPU usage and none of them seem
>>>> particularly busy.  The regionserver on each one hovers around 30% all the
>>>> time and the datanode sits at about 10% most of the time.  If we do have a
>>>> resource issue, it definitely does not seem to be CPU.
>>>> 
>>>> Increasing RAM did not seem to work either - it just made hBase use a
>>>> bigger memstore and then it took longer to do a flush.
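
(Back-of-envelope: with the default 0.4 upper limit, a 1G heap caps the
aggregate memstores near 400m while a 3G heap caps them near 1.2g, so each
forced flush pass has roughly three times as much data to write out.)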
>>>> 
>>>> 
>>>>> 
>>>>> More notes inline in below.
>>>>> 
>>>>> On Fri, Jan 15, 2010 at 1:33 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
>>>>> 
>>>>>> Approximately every 10 minutes, our entire coldfusion system pauses at
>>>>>> the point of inserting into hBase for between 30 and 60 seconds and
>>>>>> then continues.
>>>>>> 
>>>>> Yeah, enable GC logging.  See if you can make a correlation between the
>>>>> pause the client is seeing and a GC pause.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> Investigation...
>>>>>> 
>>>>>> Watching the logs of the regionserver, the pausing of the coldfusion
>>>>>> system happens as soon as one of the regionservers starts flushing the
>>>>>> memstore and recovers again as soon as it is finished flushing (recovers
>>>>>> as soon as it starts compacting).
>>>>>> 
>>>>> 
>>>>> 
>>>>> ...though, this would seem to point to an issue with your hardware.  How
>>>>> many disks?  Are they misconfigured such that they hold up the system
>>>>> when they are being heavily written to?
>>>>> 
>>>>> 
>>>>> A regionserver log at DEBUG from around this time, so we could look at
>>>>> it, would be helpful.
>>>>> 
>>>>> 
>>>>>> I can recreate the error just by stopping 1 of the regionservers; but
>>>>>> then starting the regionserver again does not make coldfusion recover
>>>>>> until I restart the coldfusion servers.  It is important to note that if
>>>>>> I keep the built-in hBase shell running, it is happily able to put and
>>>>>> get data to and from hBase whilst coldfusion is busy pausing/failing.
>>>>>> 
>>>>> 
>>>>> This seems odd.  Enable DEBUG for the client side.  Do you see the shell
>>>>> recalibrating, finding new locations for regions, after you shut down the
>>>>> single regionserver - something that your coldfusion is not doing?  Or,
>>>>> maybe, the shell is putting to a regionserver that has not been disturbed
>>>>> by your start/stop?
>>>>> 
>>>>> 
>>>>>> 
>>>>>> I have tried increasing the regionserver's RAM to 3 Gigs and this just
>>>>>> made the problem worse because it took longer for the regionservers to
>>>>>> flush the memory store.
>>>>> 
>>>>> 
>>>>> Again, if flushing is holding up the machine, if you can't write a file
>>>>> in background without it freezing your machine, then your machines are
>>>>> anemic or misconfigured?
>>>>> 
>>>>> 
>>>>>> One of the links I found on your site mentioned increasing the default
>>>>>> value for hbase.regionserver.handler.count to 100 - this did not seem to
>>>>>> make any difference.
>>>>> 
>>>>> 
>>>>> Leave this configuration in place I'd say.
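
For reference, the override being discussed, as it would appear in
hbase-site.xml (the value is the one quoted above):

  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>100</value>
  </property>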
>>>>> 
>>>>> Are you seeing 'blocking' messages in the regionserver logs?  A
>>>>> regionserver will stop taking on writes if it thinks it's being overrun,
>>>>> to prevent itself OOME'ing.  Grep for the 'multiplier' configuration in
>>>>> hbase-default.xml.
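
The 'multiplier' setting referred to is, as far as I can tell,
hbase.hregion.memstore.block.multiplier: a region blocks further updates once
its memstore reaches multiplier x the configured flush size.  As an
hbase-site.xml sketch (the value is illustrative, not a recommendation):

  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>2</value>
  </property>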
>>>>> 
>>>>> 
>>>>> 
>>>>>> I have double checked that the memory flush very rarely happens on more
>>>>>> than 1 regionserver at a time - in fact in my many hours of staring at
>>>>>> tails of logs, it only happened once where two regionservers flushed at
>>>>>> the same time.
>>>>>> 
>>>>> You've enabled DEBUG?
>>>>> 
>>>>> 
>>>>> 
>>>>>> My investigations point strongly towards a coding problem on our side
>>>>>> rather than a problem with the server setup or hBase itself.
>>>>> 
>>>>> 
>>>>> If things were slow from the client's perspective, that might be a
>>>>> client-side coding problem, but these pauses, unless you have a fly-by
>>>>> deadlock in your client code, are probably an hbase issue.
>>>>> 
>>>>> 
>>>>> 
>>>>>> I say this because whilst I understand why a regionserver would go
>>>>>> offline during a memory flush, I would expect the other two
>>>>>> regionservers to pick up the load - especially since the built-in hbase
>>>>>> shell has no problem accessing hBase whilst a regionserver is busy
>>>>>> doing a memstore flush.
>>>>>> 
>>>>> HBase does not go offline during a memory flush.  It continues to be
>>>>> available for reads and writes during this time.  And see J-D's response,
>>>>> which corrects your understanding of how loading of regions is done in an
>>>>> hbase cluster.
>>>>> 
>>>>> 
>>>>> 
>>>>> ...
>>>>> 
>>>>> 
>>>>>> I think either I am leaving out code that is required to determine
>>>>>> which RegionServers are available OR I am keeping too many hBase objects
>>>>>> in RAM instead of calling their constructors each time (my purpose
>>>>>> obviously was to improve performance).
>>>>>> 
>>>>>> 
>>>>> For sure keep a single instance of HBaseConfiguration at least, and use
>>>>> this when constructing all HTable and HBaseAdmin instances.
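
A minimal sketch of this advice, assuming the 0.20-era client API; the class
and method names are illustrative, not from any real codebase.  HTable
instances built from the same configuration instance share the underlying
HConnection and its cache of region locations:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.MasterNotRunningException;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.client.HTable;

  public class HBaseClients {
    // One configuration for the whole JVM, created once and shared.
    private static final HBaseConfiguration CONF = new HBaseConfiguration();

    public static HTable table(String name) throws IOException {
      return new HTable(CONF, name);  // reuses the cached connection
    }

    public static HBaseAdmin admin() throws MasterNotRunningException {
      return new HBaseAdmin(CONF);
    }
  }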
>>>>> 
>>>>> 
>>>>> 
>>>>>> Currently the live system is inserting over 7 Million records per day
>>>>>> (mostly between 8AM and 10PM), which is not a ridiculously high load.
>>>>>> 
>>>>>> 
>>>>> What size are the records?  What is your table schema?  How many regions
>>>>> do you currently have in your table?
>>>>> 
>>>>>  St.Ack