hbase-user mailing list archives

From "Buckley,Ron" <buckl...@oclc.org>
Subject RE: Region Server hung during shutdown after StackOverflow error
Date Fri, 30 May 2014 16:38:56 GMT
Thanks Ted. I should have seen that. 

I finally had to 'kill -9' the RS, as I couldn't get it to shut down any other way.

It seems like the Region Server shouldn't have kept telling ZooKeeper that all was well
while it was trying to abort with a fatal error.
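
To make that concrete, here is a minimal sketch of a shutdown watchdog, assuming the abort path can start a timer and the normal shutdown path can signal completion. AbortWatchdog and its methods are hypothetical names for illustration, not HBase code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase: runs a fallback action if a
// shutdown that has been started does not finish within a deadline.
final class AbortWatchdog {
    private final CountDownLatch shutdownDone = new CountDownLatch(1);

    // Called by the normal shutdown path once it has finished cleanly.
    void markShutdownComplete() {
        shutdownDone.countDown();
    }

    // Starts a daemon thread that waits for shutdown to complete.
    // If the timeout elapses first, onTimeout is run exactly once.
    Thread start(long timeout, TimeUnit unit, Runnable onTimeout) {
        Thread t = new Thread(() -> {
            try {
                if (!shutdownDone.await(timeout, unit)) {
                    onTimeout.run();
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up quietly.
                Thread.currentThread().interrupt();
            }
        }, "abort-watchdog");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

A caller might pass something like `() -> Runtime.getRuntime().halt(1)` as the fallback. Once the JVM is gone, the ZooKeeper session should expire after its timeout and the ephemeral node should disappear, which is the point at which the regions would be expected to move.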

-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Friday, May 30, 2014 12:11 PM
To: user@hbase.apache.org
Subject: Re: Region Server hung during shutdown after StackOverflow error

Looking at the StackOverflowError in pastebin, the cause was too many calls to subList().
J-D fixed a similar bug in HBASE-10312.

I searched for '\.subList(' in the 0.94 codebase but haven't pinpointed which class was the source
of such calls.

Will dig deeper when I have time.
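
For context on the subList() point, here is a minimal sketch of the general pattern, not the actual HBase code path (which hasn't been identified yet). The class and method names are made up for illustration. On JDK versions where a sublist view delegates each call to its parent view, repeatedly taking a subList() of a subList() builds a chain, and one call on the innermost view can recurse through every level. Whether that overflows the stack depends on the JDK version and the list implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not taken from HBase.
final class SubListNesting {
    // Risky pattern: each iteration wraps the previous view in a new view,
    // so the result is a view of a view of a view, 'times' levels deep.
    static List<Integer> dropFirstNested(List<Integer> list, int times) {
        List<Integer> view = list;
        for (int i = 0; i < times; i++) {
            view = view.subList(1, view.size());
        }
        return view;
    }

    // Safer pattern: take a single sublist of the original list and copy it,
    // so the result has no chain of parent views behind it.
    static List<Integer> dropFirstFlat(List<Integer> list, int times) {
        return new ArrayList<>(list.subList(times, list.size()));
    }
}
```

Both methods return the same elements. The difference is only in how many view layers sit between the caller and the backing list.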


On Fri, May 30, 2014 at 8:24 AM, Buckley,Ron <buckleyr@oclc.org> wrote:

> An interesting case happened on our dev HBase cluster overnight. (We're
> running HBase 0.94.15 from CDH 4.6.0.)
> A region server hit a StackOverflowError, apparently during
> a minor compaction.
> The region server is trying to shut down with a Fatal, but is now hung 
> during shutdown.
> The particularly troublesome thing is that the RS is alive enough to 
> keep zookeeper happy.
> So, the regions aren't moving off, but our apps can't get to them
> because the RS is mostly dead.
> I put some of the details on pastebin.
> JStack  -> http://pastebin.com/hnLtaG54
> Outfile -> http://pastebin.com/5F1UcGjg
> Logfile -> http://pastebin.com/TBL1YSZM