hbase-user mailing list archives

From Seraph Imalia <ser...@eisp.co.za>
Subject Re: RAM Problems - Keeps Crashing
Date Thu, 29 Dec 2011 15:56:48 GMT
Thanks,

I will try disabling it to see if the memory is being taken up by MSLAB.
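For reference, a minimal sketch of that change, using the property name Ted quoted below. It assumes the override goes inside the <configuration> element of hbase-site.xml on each region server, and that the region servers need a restart to pick it up; neither detail is stated in this thread.

```xml
<!-- Assumed location: conf/hbase-site.xml on every region server -->
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <!-- false turns MSLAB off; the thread quotes the 0.90.4 value as true -->
  <value>false</value>
</property>
```

Comparing usedHeap on the master status page before and after the change should show whether MSLAB accounts for the growth.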

Regards,
Seraph

On 29 Dec 2011, at 5:47 PM, Ted Yu wrote:

> mslab was introduced after 0.20.6
> 
> Read Todd's series:
> http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/
> 
> Cheers
> 
> On Thu, Dec 29, 2011 at 12:19 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
> 
>> Region Servers
>> 
>> Address            Start Code     Load
>> dynobuntu10:60030  1325081250180  requests=43, regions=224, usedHeap=3946, maxHeap=4087
>> dynobuntu12:60030  1325081249966  requests=32, regions=224, usedHeap=3821, maxHeap=4087
>> dynobuntu17:60030  1325081248407  requests=39, regions=225, usedHeap=4016, maxHeap=4087
>> Total:             servers: 3     requests=114, regions=673
>> 
>> I restarted them yesterday and the number of regions increased from 667 to
>> 673 and they are about to run out of heap again :(.  Should I set that
>> property to false? - what does mslab do? - is it new after 0.20.6?
>> 
>> Regards,
>> Seraph
>> 
>> On 28 Dec 2011, at 5:46 PM, Ted Yu wrote:
>> 
>>> Can you tell me how many regions each region server hosts?
>>> 
>>> In 0.90.4 there is this parameter:
>>> <name>hbase.hregion.memstore.mslab.enabled</name>
>>> <value>true</value>
>>> mslab tends to consume heap if region count is high.
>>> 
>>> Cheers
>>> 
>>> On Wed, Dec 28, 2011 at 6:27 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
>>> 
>>>> Hi Guys,
>>>> 
>>>> After updating from 0.20.6 to 0.90.4, we have been having serious RAM
>>>> issues.  I had hbase-env.sh set to use 3 Gigs of RAM with 0.20.6 but with
>>>> 0.90.4 even 4.5 Gigs seems not enough.  It does not matter how much load
>>>> the hbase services are under, it just crashes after 24-48 hours.  The only
>>>> difference the load makes is how quickly the services crash.  Even over
>>>> this holiday season with our lowest load of the year, it crashes just after
>>>> 36 hours of being started.  To fix it, I have to run the stop-hbase.sh
>>>> command, wait a while and kill -9 any hbase processes that have stopped
>>>> outputting logs or stopped responding, and then run start-hbase.sh again.
>>>> 
>>>> Attached are my logs from the latest "start-to-crash".  There are 3
>>>> servers and hbase is being used for storing URLs - 7 client servers
>>>> connect to hbase and perform URL Lookups at about 40 requests per second
>>>> (this is the low load over this holiday season).  If the URL does not
>>>> exist, it gets added.  The Key on the HTable is the URL and there are a few
>>>> fields stored against it - e.g. DateDiscovered, Host, Script, QueryString,
>>>> etc.
>>>> 
>>>> Each server has a hadoop datanode and an hbase regionserver and 1 of the
>>>> servers additionally has the namenode, master and zookeeper.  On first
>>>> start, each regionserver uses 2 Gigs (usedHeap) and as soon as I restart
>>>> the clients, the usedHeap slowly climbs until it reaches the maxHeap and
>>>> shortly after that, the regionservers start crashing - sometimes they
>>>> actually shut down gracefully by themselves.
>>>> 
>>>> Originally, we had hbase.regionserver.handler.count set to 100 and I have
>>>> now removed that to leave it as default which has not helped.
>>>> 
>>>> We have not made any changes to the clients and we have a mirrored
>>>> instance of this in our UK Data Centre which is still running 0.20.6 and
>>>> servicing 10 clients currently at over 300 requests per second (again low
>>>> load over the holidays) and it is 100% stable.
>>>> 
>>>> What do I do now? - your website says I cannot downgrade?
>>>> 
>>>> Please help
>>>> 
>>>> Regards,
>>>> Seraph
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 

