From: Vladimir Rodionov
To: "user@hbase.apache.org"
Date: Tue, 10 Jun 2014 11:49:01 -0700
Subject: RE: Is this a long GC pause, or something else?

1. Do you have GC logging enabled on your cluster? It does not look like a GC pause to me, but for future troubleshooting it is better to enable GC logging.
2. How large is your cluster? Did you check NN and DN logs as well? Are all your nodes (RS and DN) up and running? No dead nodes?

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Tom Brown [tombrown52@gmail.com]
Sent: Tuesday, June 10, 2014 11:13 AM
To: user@hbase.apache.org
Subject: Re: Is this a long GC pause, or something else?

We are still using 0.94.10. We are looking at upgrading soon, but have not done so yet.

--Tom

On Tue, Jun 10, 2014 at 12:10 PM, Ted Yu wrote:

> Which release are you using?
>
> In 0.98+, there is JvmPauseMonitor.
>
> Cheers
>
> On Tue, Jun 10, 2014 at 11:05 AM, Tom Brown wrote:
>
> > Last night a regionserver in my cluster stopped responding in a timely
> > manner for about 20 minutes. I know that stop-the-world GC can cause this
> > type of behavior, but 20 minutes seems excessive.
> >
> > The server is a 2 core VM with 16GB of RAM, (hbase max heap is 12GB). We
> > are using the latest java 7 from oracle. HDFS is provided by an Isilon
> > cluster.
> >
> > The server workload is read/write: the writing process reads all rows it
> > is about to write, updates them if they exist, and then writes all the
> > rows (replacing ones that were updated).
> >
> > The last messages before the pause were regarding an HLog roll:
> >
> > DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
> > INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultReplication
> > INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultBlockSize
> >
> > During the next 20 minutes there were a handful of sporadic LruBlockCache
> > stats messages but nothing else. After 20 minutes, normal operation
> > resumed.
> >
> > Is 20 minutes for a GC pause expected given the operational load and
> > machine specs? Could a GC pause include periodic log messages? If it
> > wasn't a GC pause, what else could it be?
> >
> > --Tom
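
On the question of telling a JVM pause apart from something else: the general idea behind a pause monitor such as the JvmPauseMonitor mentioned above can be sketched in a few lines. A thread sleeps for a fixed interval and measures how much longer than requested the sleep actually took. This is only a minimal illustration under stated assumptions, not the actual HBase or Hadoop class; the class name PauseSketch, the 500 ms interval, and the 1000 ms threshold are all hypothetical values chosen for the example.

```java
import java.util.concurrent.TimeUnit;

public class PauseSketch {
    // Hypothetical values, chosen for illustration only.
    static final long SLEEP_MS = 500;
    static final long WARN_THRESHOLD_MS = 1000;

    /** Milliseconds by which a sleep overran its intended length (never negative). */
    public static long overshootMillis(long startNanos, long endNanos, long intendedSleepMs) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMs - intendedSleepMs);
    }

    /** Sleeps repeatedly and reports any wakeup that arrives much later than requested. */
    public static void monitor(int iterations) throws InterruptedException {
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(SLEEP_MS);
            long extra = overshootMillis(start, System.nanoTime(), SLEEP_MS);
            if (extra > WARN_THRESHOLD_MS) {
                // The process was stalled; this alone does not say whether GC or the host caused it.
                System.err.println("Detected pause of approximately " + extra + " ms");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        monitor(4); // short bounded run; a real monitor would loop in a daemon thread
    }
}
```

A large overshoot suggests the whole process was stalled, whether by GC or by the VM or host. If such a thread keeps waking up on time during an incident, that points toward something other than a stop-the-world pause, for example blocked I/O. GC logs are still needed to confirm whether a detected stall was a collection.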