From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: Is this a long GC pause, or something else?
Date Wed, 11 Jun 2014 04:10:37 GMT
Hi Tom,

Aha.  Our pauses keep happening. :(

We use SPM - see http://sematext.com/spm/ - it has support for HBase and
Hadoop metrics, among other things.  As a matter of fact, for
troubleshooting an issue like this one you may also want to ship your logs
into Logsene <http://sematext.com/logsene/>.  Doing that will let you
correlate your pause with messages in the logs, which could help you figure
out what's going on next time something like this happens.
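
One way to do that kind of correlation by hand is to scan a regionserver log for long silences between consecutive timestamped lines. The following is only a minimal sketch: it assumes each log line starts with a log4j-style timestamp of the form "YYYY-MM-DD HH:MM:SS,mmm" (the log excerpts quoted in this thread have their timestamps stripped, so that layout is an assumption), and `find_gaps` and `min_gap` are hypothetical names used for illustration.

```python
from datetime import datetime, timedelta

# Assumption: every log record begins with "YYYY-MM-DD HH:MM:SS,mmm".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # length of "2014-06-10 02:00:00,000"


def find_gaps(lines, min_gap=timedelta(minutes=1)):
    """Return (previous_line, next_line, gap) for each silence longer than min_gap."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            # Continuation lines (stack traces, wrapped text) carry no timestamp.
            continue
        if prev_ts is not None and ts - prev_ts > min_gap:
            gaps.append((prev_line.rstrip(), line.rstrip(), ts - prev_ts))
        prev_ts, prev_line = ts, line
    return gaps


if __name__ == "__main__":
    # Hypothetical path; point this at a real regionserver log.
    with open("regionserver.log") as log:
        for before, after, gap in find_gaps(log):
            print(f"silent for {gap}:\n  last:  {before}\n  first: {after}")
```

The lines on either side of each gap show what the process was doing just before it stalled and just after it recovered. If a few messages still appear during the stall (such as periodic stats output), lowering `min_gap` or filtering those lines out first may be needed to see the pause as one interval.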

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Jun 10, 2014 at 7:52 PM, Tom Brown <tombrown52@gmail.com> wrote:

> Otis,
>
> I'm not sure our issue is the same (although they could turn out to be
> related). As far as I have been able to determine, we have only had a
> single long pause.
>
> However, we don't have much experience micromanaging our JVMs. How did you
> generate those graphs?
>
> --Tom
>
>
> On Tue, Jun 10, 2014 at 4:52 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
>
> > No, I don't think so.  We had it until this morning and didn't see this
> > problem.  We'll probably switch to it tomorrow morning before we change EC2
> > instances and see if that removes the problem.
> >
> > Tom - do your pauses look like the ones in our SPM graphs?
> >
> > Otis
> >
> >
> > On Tue, Jun 10, 2014 at 6:38 PM, Vladimir Rodionov <vrodionov@carrieriq.com> wrote:
> >
> > > Unbelievable. Do you see the same with the latest OpenJDK?
> > >
> > > Best regards,
> > > Vladimir Rodionov
> > > Principal Platform Engineer
> > > Carrier IQ, www.carrieriq.com
> > > e-mail: vrodionov@carrieriq.com
> > >
> > > ________________________________________
> > > From: Otis Gospodnetic [otis.gospodnetic@gmail.com]
> > > Sent: Tuesday, June 10, 2014 2:43 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: Is this a long GC pause, or something else?
> > >
> > > Does it repeat?
> > > We are seeing this with u60 oracle JVM too!  SPM shows the whole JVM
> > > blocking for about 16 minutes every M minutes.
> > >
> > > Otis
> > >
> > >
> > >
> > > > On Jun 10, 2014, at 2:05 PM, Tom Brown <tombrown52@gmail.com> wrote:
> > > >
> > > > Last night a regionserver in my cluster stopped responding in a timely
> > > > manner for about 20 minutes. I know that stop-the-world GC can cause this
> > > > type of behavior, but 20 minutes seems excessive.
> > > >
> > > > The server is a 2-core VM with 16GB of RAM (HBase max heap is 12GB). We
> > > > are using the latest Java 7 from Oracle. HDFS is provided by an Isilon
> > > > cluster.
> > > >
> > > > The server workload is read/write: the writing process reads all rows it is
> > > > about to write, updates them if they exist, and then writes all the rows
> > > > (replacing ones that were updated).
> > > >
> > > > The last messages before the pause were regarding an HLog roll:
> > > >
> > > > DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
> > > > INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultReplication
> > > > INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultBlockSize
> > > >
> > > > During the next 20 minutes there were a handful of sporadic LruBlockCache
> > > > stats messages but nothing else. After 20 minutes, normal operation resumed.
> > > >
> > > > Is 20 minutes for a GC pause expected given the operational load and
> > > > machine specs? Could a GC pause include periodic log messages? If it wasn't
> > > > a GC pause, what else could it be?
> > > >
> > > > --Tom
