hbase-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Zookeeper session lost
Date Wed, 31 Mar 2010 15:23:00 GMT
OK, well, swapping, especially if combined with GC, can definitely account 
for very long delays.

Not sure if anyone mentioned this before, but take a look at the swapping 
section on the ZK troubleshooting page. That section, or perhaps one of 
the other sections on that page, might give you additional insight.
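
One way to confirm whether a node is actually swapping around the time of a session expiry is to sample the kernel's swap counters. The following is a minimal sketch, assuming a Linux host that exposes the cumulative `pswpin` and `pswpout` counters in /proc/vmstat; the function names and the polling interval are illustrative only and do not come from the ZK or HBase tooling.

```python
"""Sketch: detect swap activity by sampling Linux /proc/vmstat counters."""
import time
from pathlib import Path


def parse_swap_counters(vmstat_text):
    """Return the cumulative pages swapped in/out found in /proc/vmstat text."""
    counters = {"pswpin": 0, "pswpout": 0}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line has the form "<name> <value>".
        if len(parts) == 2 and parts[0] in counters:
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in/out between two samples."""
    return {name: after[name] - before[name] for name in before}


def watch(interval_s=5.0, path="/proc/vmstat"):
    """Print a line whenever swap activity occurs between two samples."""
    prev = parse_swap_counters(Path(path).read_text())
    while True:
        time.sleep(interval_s)
        cur = parse_swap_counters(Path(path).read_text())
        delta = swap_delta(prev, cur)
        if delta["pswpin"] or delta["pswpout"]:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} swapped in={delta['pswpin']} out={delta['pswpout']} pages")
        prev = cur
```

Calling `watch()` on the region server host and lining up its timestamps with the region server log should show whether swap activity precedes the long pauses.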

Good Luck,


Peter Falk wrote:
> We had a 4GB heap for the region server, on a machine with 8GB that was 
> also running a data node and a ZooKeeper. We have tried the 
> incremental garbage collector before, but had problems with a runaway 
> heap size, resulting in swapping. We were, and still are, running with 
> the parallel GC. When the session expiry problem occurred, we 
> noticed swapping on the node just before. Therefore, we are a bit afraid 
> to increase the heap size further, or to try the incremental GC again. We 
> are not running in any virtualized environment.
> Thanks for the various responses and recommendations. I think it 
> would be nice to have an option to automatically restart the region 
> server in situations like this.
> TIA,
> Peter
> On Tue, Mar 30, 2010 at 18:25, Patrick Hunt <phunt@apache.org 
> <mailto:phunt@apache.org>> wrote:
>     Are you running in a virtualized environment by chance? (EC2,
>     VMware, etc.) VMs, especially oversubscribed/overloaded VMs, can
>     result in significant I/O- and memory-related performance problems.
>     Patrick
>     Peter Falk wrote:
>         Thanks, Jean-Daniel. I was not clear about what we have already
>         tried: we have tried everything you recommend in the updated
>         wiki page, including upping the ZooKeeper session timeout. The
>         node was heavily loaded at the time, and it seems the cluster
>         was simply overloaded.
>         However, would it not be possible to automatically start the
>         region server again and let it request new regions? It seems
>         dangerous to let region servers die under heavy load like this,
>         which increases the load further on the remaining nodes...
>         Sincerely,
>         Peter
>         On Mon, Mar 29, 2010 at 19:38, Jean-Daniel Cryans
>         <jdcryans@apache.org <mailto:jdcryans@apache.org>> wrote:
>             We already had an entry in the wiki for this issue, but it
>             wasn't super explicit about what's happening, so I
>             completely rewrote it using the logs from this thread. See
>             http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A9
>             Also, I created a jira about putting that link directly
>             into the "We slept Xms, ..." message so that people can get
>             some answers quickly.
>             See https://issues.apache.org/jira/browse/HBASE-2388
>             J-D
