hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: RAM Problems - Keeps Crashing
Date Tue, 03 Jan 2012 16:39:19 GMT
On Wed, Dec 28, 2011 at 6:27 AM, Seraph Imalia <seraph@eisp.co.za> wrote:
> After updating from 0.20.6 to 0.90.4, we have been having serious RAM issues.  I had
> hbase-env.sh set to use 3 Gigs of RAM with 0.20.6 but with 0.90.4 even 4.5 Gigs seems not
> enough.  It does not matter how much load the hbase services are under, it just crashes after
> 24-48 hours.

What kind of 'crash' is it?  Is it an OOME, a JVM segfault, or just
a full GC making the RS look like it's gone away?
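
One rough way to tell the three cases apart is to scan the regionserver
logs (and the .out file) for telltale strings. The sketch below is only
illustrative: the pattern names and the exact message texts are
assumptions, so adjust them to what your JVM and HBase version actually
print.

```python
import re

# Assumed marker strings; verify them against your own logs before relying on this.
PATTERNS = {
    # Heap exhaustion reported by the JVM.
    "oome": re.compile(r"java\.lang\.OutOfMemoryError"),
    # HotSpot fatal-error banner or a segfault signal name.
    "jvm_crash": re.compile(
        r"A fatal error has been detected by the Java Runtime Environment|SIGSEGV"
    ),
    # A thread that slept far longer than asked, which usually points at a long GC pause.
    "long_pause": re.compile(r"We slept \d+ms"),
}


def classify(lines):
    """Count how many lines match each crash signature."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feed it the lines of a log file, e.g. classify(open(path, errors="replace")),
and see which counter is non-zero around the time of the crash.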


> The only difference the load makes is how quickly the services crash.  Even over this
> holiday season with our lowest load of the year, it crashes just after 36 hours of being started.
> To fix it, I have to run the stop-hbase.sh command, wait a while and kill -9 any hbase processes
> that have stopped outputting logs or stopped responding, and then run start-hbase.sh again.
>

The process is deadlocked?   IIRC, 0.90.4 had a possible deadlock.
You could try 0.90.5.

I took a look at some of the logs.  They do not seem to run from
server start, because I do not see the ulimit output in there.  I'd
like to see that.

Looking at dynobuntu10, I see some interesting 'keys':

2011-12-28 15:25:53,297 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Received request
to open region:
UrlIndex,http://www.hellopeter.com/write_report_preview.php?inclination=1&company=kalahari.net&countryid=168&location=cape
town&industryid=14&person=&problem=out of stock&other=&headline=why
advertise goods online and you cannot deliver%29&incident=i purchased
goods online that were supposedly in stock on the 5th october. 2010.
after numerous phone calls i was promised that i would receive the
ordered goods on the 20th october 2010. this has not happened to date.
i spoke with them today and they promised to answer my queries on
21st october2010. how can you run a online busines ans sell %22we dont
have stock%22%3a this is the easy way out as we have no proof of
that%0d%0ait is just common curtousy to return a phone call. they have
had my money in their bank account for 15 days. this seems like a
****. they could be reaping interest on thousands of peoples money.
easy way of making money.%0d%0akalahari. net are in a comfort zone.
they need to realize that customers are king%0d%0athey reimburse my
money. i paid bank charges and transfer fees. what about this. my
unnessessary phone calls. do they reinburse this.%0d%0acome on stop
taking the innocent public for a ride with your sweet
talk.&incidentcharsleft=270&incident_day_select=21&incident_month_select=10&incident_year_select=2010&incident_hour_select=11&incident_min_select=45&incident_ampm_select=pm&policyno=3573210
%2f3573310 &cellno=%2b27
766881896&preview=preview,1308921597915.1827414390

That's a single key.  It looks like you have an issue in your crawler's
url extraction facility.
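
As a sketch of the kind of guard that would keep keys like this out of
the table, the crawler could normalize each URL before using it as a row
key. Everything here is hypothetical: the function name, the 256-byte
cap, and the choice to drop the query string (which may or may not be
acceptable for your index) are assumptions, not something taken from
your code.

```python
from urllib.parse import urlsplit, urlunsplit

MAX_KEY_BYTES = 256  # hypothetical cap; pick whatever suits your schema


def url_row_key(url, max_bytes=MAX_KEY_BYTES):
    """Return a row key built from scheme, host and path only.

    Returns None when the key is still too long, so the caller can
    skip or log the URL instead of storing a giant key.
    """
    parts = urlsplit(url.strip())
    # Drop the query string and fragment, where the form data above lives.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    key = base.encode("utf-8")
    if len(key) > max_bytes:
        return None
    return key
```

Rejecting an overlong key rather than truncating it avoids two different
URLs silently collapsing into the same row.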

If you have lots of URLs like the above, my guess is that you have
massive indices.  Look at a regionserver and see how much RAM the
indexes take up.
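
For a back-of-the-envelope feel of why key size matters: in 0.90 each
store file's block index keeps one key per data block in memory, so
index RAM grows with both the amount of data and the key length.  The
sketch below just does that arithmetic.  The 64 KB block size is the
default, the 12 bytes of per-entry overhead is an assumption (it ignores
Java object overhead), and the data sizes in the usage note are made-up
numbers, not measurements from your cluster.

```python
def estimate_block_index_bytes(data_bytes, avg_key_bytes,
                               block_bytes=64 * 1024,
                               per_entry_overhead=12):
    """Rough lower bound on block index memory for a given data volume.

    One index entry per data block; each entry holds a key plus an
    assumed fixed overhead (offset and size fields).
    """
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * (avg_key_bytes + per_entry_overhead)
```

With a hypothetical 100 GiB of store files per regionserver,
estimate_block_index_bytes(100 * 1024**3, 1500) comes to about 2.3 GiB,
while the same data with 60-byte keys needs only about 112 MiB.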

In dynoubuntu12 I see an OOME.  Interestingly, the OOME is while
trying to read in a file's index on:

2011-12-28 15:26:50,310 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Opening region: REGION
=> {NAME => 'UrlIndex,http://media.imgbnr.com/images/prep_ct.php?imgfile=4327_567146_7571713_250_300.html&partnerid=113471&appid=35229&subid=&advertiserid=567146&keywordid=42825417&type=11&uuid=e11ac4bea82d42838fde8eb306fbc354&keyword=www.&matchedby=c&ct=cpi&wid=5008233&size=300x250&lid=7571713&cid=230614&cc=us&rc=in&mc=602&dc=0&vt=1275659190365&refurl=mangafox.com&clickdomain=66.45.56.124&pinfo=&rurl=http://javascript,1283006905877',
STARTKEY => 'http://media.imgbnr.com/images/prep_ct.php?imgfile=4327_567146_7571713_250_300.html&partnerid=113471&appid=35229&subid=&advertiserid=567146&keywordid=42825417&type=11&uuid=e11ac4bea82d42838fde8eb306fbc354&keyword=www.&matchedby=c&ct=cpi&wid=5008233&size=300x250&lid=7571713&cid=230614&cc=us&rc=in&mc=602&dc=0&vt=1275659190365&refurl=mangafox.com&clickdomain=66.45.56.124&pinfo=&rurl=http://javascript',
ENDKEY => 'http://media.imgbnr.com/images/prep_ct.php?imgfile=6966_567146_7571715_90_728.html&partnerid=113474&appid=35224&subid=&advertiserid=567146&keywordid=42825616&type=11&uuid=6178294088f545ab938c403be5b7c957&keyword=www.&matchedby=c&ct=cpi&wid=5008236&size=728x90&lid=7571715&cid=230615&cc=us&rc=ny&mc=501&dc=0&vt=1275772980357&refurl=worldstarhiphop.com&clickdomain=66.45.56.124&pinfo=&rurl=http://javascript',
ENCODED => 1246560666, TABLE => {{NAME => 'UrlIndex', INDEXES =>
'