hbase-user mailing list archives

From Shuai Lin <linshuai2...@gmail.com>
Subject Re: Region Server OutOfMemory Error
Date Thu, 08 Jan 2015 02:29:55 GMT
Cool, will take a look. Thanks!

On Thu, Jan 8, 2015 at 3:26 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> You can get graphs like this (+ alerts, anomaly detection, events, etc.)
> from SPM: http://sematext.com/spm
>
> HBase 0.98 metrics coming later this month.
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Jan 6, 2015 at 11:42 PM, Shuai Lin <linshuai2012@gmail.com> wrote:
>
> > Cool, how can I get a graph like that?
> >
> > On Wed, Jan 7, 2015 at 4:06 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > The first thing I'd want to know is which memory pool is getting
> > > filled.  There are several in the JVM.
> > > Here's an example: https://apps.sematext.com/spm-reports/s/kZgBWLsJRd
> > > (this one is actually from an HBase cluster).  If you see any of the
> > > lines at 100%, that's potential trouble.  If a line stays at, or
> > > constantly close to, 100%, that's an OOM waiting to happen, and you
> > > should check your GC and CPU graphs to see how much time the JVM is
> > > spending on GC.
> > >
> > > Once you know which pool is problematic you'll be better informed and
> > > may be able to increase the size of just that pool.
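(The pool-level usage described above can also be read directly from inside the JVM via the standard java.lang.management API; a minimal sketch. The class name is illustrative, and the exact pool names vary by collector and JVM version, e.g. G1 typically reports "G1 Eden Space", "G1 Survivor Space", and "G1 Old Gen".)

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class PoolUsage {
    public static void main(String[] args) {
        // Print every JVM memory pool with its current usage vs. its max.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            long max = u.getMax(); // -1 means "undefined" for this pool
            System.out.printf("%-25s used=%dM max=%s%n",
                    pool.getName(),
                    u.getUsed() / (1024 * 1024),
                    max < 0 ? "n/a" : (max / (1024 * 1024)) + "M");
        }
    }
}
```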
> > >
> > > Otis
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > >
> > > On Tue, Jan 6, 2015 at 6:32 AM, Shuai Lin <linshuai2012@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > We have an HBase cluster of 5 region servers, each hosting 60+
> > > > regions.
> > > >
> > > > But under heavy load the region servers crash with OOME now and
> > > > then:
> > > >
> > > > #
> > > > # java.lang.OutOfMemoryError: Java heap space
> > > > # -XX:OnOutOfMemoryError="kill -9 %p"
> > > > #   Executing /bin/sh -c "kill -9 16820"...
> > > >
> > > > We have the max heap size set to 22GB (-Xmx22528m) for each RS, and
> > > > use G1GC (-XX:+UseG1GC). To debug the problem we have turned on the
> > > > JVM GC log.  The last few lines of the GC log before each crash are
> > > > always like this:
> > > >
> > > > 2015-01-06T11:10:19.087+0000: 5035.720: [Full GC 7122M->5837M(21G),
> > > > 0.8867660 secs]
> > > >    [Eden: 1024.0K(7278.0M)->0.0B(8139.0M) Survivors: 68.0M->0.0B
> > > >    Heap: 7122.7M(22.0G)->5837.2M(22.0G)]
> > > >  [Times: user=1.42 sys=0.00, real=0.89 secs]
> > > > 2015-01-06T11:10:19.976+0000: 5036.608: [Full GC 5837M->5836M(21G),
> > > > 0.6378260 secs]
> > > >    [Eden: 0.0B(8139.0M)->0.0B(8139.0M) Survivors: 0.0B->0.0B
> > > >    Heap: 5837.2M(22.0G)->5836.5M(22.0G)]
> > > >  [Times: user=0.93 sys=0.00, real=0.63 secs]
> > > >
> > > > From the last line I see the heap only occupies 5837MB, and the
> > > > capacity is 22GB, so how can the OOM happen? Or is my interpretation
> > > > of the GC log wrong?
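(For reference, the heap figures in log lines like the ones quoted above can be pulled out mechanically to confirm that reading. A minimal sketch, assuming G1's detailed log format `Heap: before(cap)->after(cap)`; the class name and regex are illustrative, not part of any HBase or JVM tooling.)

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcHeapLine {
    // Matches e.g. "Heap: 7122.7M(22.0G)->5837.2M(22.0G)"
    private static final Pattern HEAP = Pattern.compile(
            "Heap: ([0-9.]+)([MG])\\(([0-9.]+)([MG])\\)"
            + "->([0-9.]+)([MG])\\(([0-9.]+)([MG])\\)");

    // Normalize a value with an M/G unit suffix to megabytes.
    static double toMb(double value, String unit) {
        return unit.equals("G") ? value * 1024 : value;
    }

    public static void main(String[] args) {
        String line = "Heap: 5837.2M(22.0G)->5836.5M(22.0G)";
        Matcher m = HEAP.matcher(line);
        if (m.find()) {
            double afterMb = toMb(Double.parseDouble(m.group(5)), m.group(6));
            double capMb = toMb(Double.parseDouble(m.group(7)), m.group(8));
            // Post-GC occupancy as a fraction of total heap capacity.
            System.out.printf("post-GC occupancy: %.1f%% (%.0fM of %.0fM)%n",
                    100.0 * afterMb / capMb, afterMb, capMb);
        }
    }
}
```

Run against the last line of the log above, this reports the heap roughly a quarter full after the Full GC, which is what makes the subsequent OOME surprising.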
> > > >
> > > > I read some articles and only got a basic understanding of G1GC.
> > > > I've tried tools like GCViewer, but none gives me a useful
> > > > explanation of the details of the GC log.
> > > >
> > > >
> > > > Regards,
> > > > Shuai
> > > >
> > >
> >
>
