Subject: Re: Cluster Wide Pauses
From: Wayne
To: user@hbase.apache.org
Date: Fri, 14 Jan 2011 13:02:38 -0500

I have not confirmed anything for sure, other than it does not cross tables (it does not happen when 2 tables are being written to). The region server logs are filled with compactions, splits, and memstore flushes, so during a pause everything looks like any other time to me.

On Fri, Jan 14, 2011 at 12:29 PM, Jonathan Gray wrote:

> These are a different kind of pause (those caused by blockingStoreFiles).
>
> This is HBase stepping in and actually blocking updates to a region because compactions have not been able to keep up with the write load. It could manifest itself in the same way, but this is different than the shorter pauses caused by periodic offlining of regions during balancing and splits.
>
> Wayne, have you confirmed in your RegionServer logs that the pauses are associated with splits or region movement, and that you are not seeing the blocking store files issue?
>
> JG
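For anyone who wants to look at the write-blocking behavior JG describes, the two settings involved are the ones Chris mentions below, and they live in hbase-site.xml. A minimal sketch with illustrative values only (check hbase-default.xml for the actual defaults in your version before copying anything):

    <!-- hbase-site.xml: illustrative values, not recommendations -->
    <property>
      <!-- block updates to a region once one of its stores has this many store files -->
      <name>hbase.hstore.blockingStoreFiles</name>
      <value>16</value>
    </property>
    <property>
      <!-- block updates once a memstore grows to multiplier x the flush size -->
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>12</value>
    </property>

Raising these lets more store files and bigger memstores pile up before writes are stopped, so it trades read performance and heap headroom for fewer hard stalls; it only helps if the RegionServer logs actually show updates being blocked.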
> > -----Original Message-----
> > From: cft@tarnas.org [mailto:cft@tarnas.org] On Behalf Of Christopher Tarnas
> > Sent: Friday, January 14, 2011 7:29 AM
> > To: user@hbase.apache.org
> > Subject: Re: Cluster Wide Pauses
> >
> > I have been seeing similar problems, and I found that by raising hbase.hregion.memstore.block.multiplier to above 12 (the default is 2) and hbase.hstore.blockingStoreFiles to 16 I managed to reduce the frequency of the pauses during loads. My nodes are pretty beefy (48 GB of RAM) so I had room to experiment.
> >
> > From what I understand, that gave the regionservers more buffer before they had to halt the world to catch up. The pauses still happen, but their impact is less now.
> >
> > -chris
> >
> > On Fri, Jan 14, 2011 at 8:34 AM, Wayne wrote:
> >
> > > We have not found any smoking gun here. Most likely these are region splits on a quickly growing/hot region that all clients get caught waiting for.
> > >
> > > On Thu, Jan 13, 2011 at 7:49 AM, Wayne wrote:
> > >
> > > > Thank you for the lead! We will definitely look closer at the OS logs.
> > > >
> > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano wrote:
> > > >
> > > > > Hi Wayne,
> > > > >
> > > > > > We are seeing some TCP Resets on all nodes at the same time, and sometimes quite a lot of them.
> > > > >
> > > > > Have you checked this article from Andrei and Cosmin? They had a busy firewall causing a network blackout.
> > > > >
> > > > > http://hstack.org/hbase-performance-testing/
> > > > >
> > > > > Maybe it's not your case, but just to be sure.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > --
> > > > > Tatsuya Kawano (Mr.)
> > > > > Tokyo, Japan
> > > > >
> > > > > On Jan 13, 2011, at 4:52 AM, Wayne wrote:
> > > > >
> > > > > > We are seeing some TCP Resets on all nodes at the same time, and sometimes quite a lot of them. We have yet to correlate the pauses to the TCP resets, but I am starting to wonder if this is partly a network problem. Does Gigabit Ethernet break down on high volume nodes? Do high volume nodes use 10G or InfiniBand?
> > > > > >
> > > > > > On Wed, Jan 12, 2011 at 1:52 PM, Stack wrote:
> > > > > >
> > > > > > > Jon asks that you describe your loading in the issue. Would you mind doing so? Ted, stick up in the issue the workload and configs you are running if you don't mind. I'd like to try it over here.
> > > > > > > Thanks lads,
> > > > > > > St.Ack
> > > > > > >
> > > > > > > On Wed, Jan 12, 2011 at 9:03 AM, Wayne wrote:
> > > > > > >
> > > > > > > > Added: https://issues.apache.org/jira/browse/HBASE-3438.
> > > > > > > >
> > > > > > > > On Wed, Jan 12, 2011 at 11:40 AM, Wayne wrote:
> > > > > > > >
> > > > > > > > > We are using 0.89.20100924, r1001068.
> > > > > > > > >
> > > > > > > > > We are seeing it during heavy write load (which is all the time), but yesterday we had read load as well as write load and saw both reads and writes stop for 10+ seconds. The region size is the biggest clue we have found from our tests: setting up a new cluster with a 1GB max region size and starting to load heavily, we will see this a lot, for long time frames. Maybe the bigger file gets hung up more easily with a split? Your description below also fits, in that early on the load is not well balanced, so it is easier to stop everything on one node. I will file a JIRA. I will also try to dig deeper into the logs during the pauses to find a node that might be stuck in a split.
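The "max region size" being tuned in this thread is the hbase.hregion.max.filesize setting: once any store file in a region grows past that many bytes, the region is split. A minimal hbase-site.xml sketch, with the 1GB figure from above shown purely as an illustration (the value is in bytes; the era's default was much smaller):

    <property>
      <!-- split a region once one of its store files exceeds ~1 GB -->
      <name>hbase.hregion.max.filesize</name>
      <value>1073741824</value>
    </property>

Bigger regions mean fewer regions per server but longer-lived splits and compactions, which is consistent with the pauses being easier to trigger at 1GB than at the default size.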
> > > > > > > > > On Wed, Jan 12, 2011 at 11:17 AM, Stack wrote:
> > > > > > > > >
> > > > > > > > > > On Tue, Jan 11, 2011 at 2:34 PM, Wayne wrote:
> > > > > > > > > > > We have very frequent cluster wide pauses that stop all reads and writes for seconds.
> > > > > > > > > >
> > > > > > > > > > All reads and all writes?
> > > > > > > > > >
> > > > > > > > > > I've seen the pause too for writes. It's something I've always meant to look into. Friso postulates one cause. Another that we've talked of is a region taking a while to come back online after a split or a rebalance, for whatever reason. Client loading might be 'random', spraying over lots of random regions, but they all get stuck waiting on one particular region to come back online.
> > > > > > > > > >
> > > > > > > > > > I suppose reads could be blocked for the same reason if all are trying to read from the offlined region.
> > > > > > > > > >
> > > > > > > > > > What version of HBase are you using? Splits should be faster in 0.90 now that the split daughters come up on the same regionserver.
> > > > > > > > > >
> > > > > > > > > > Sorry I don't have a better answer for you. Need to dig in.
> > > > > > > > > >
> > > > > > > > > > File a JIRA. If you want to help out some, stick some data up in it. Some suggestions would be to enable logging of when we look up region locations in the client and then note when requests go to zero. Can you figure out what region the clients are waiting on (if they are waiting on any)? If you can pull out a particular one, try and elicit its history at the time of blockage. Is it being moved or mid-split? I suppose it makes sense that bigger regions would make the situation 'worse'. I can take a look at it too.
> > > > > > > > > >
> > > > > > > > > > St.Ack
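Stack's suggestion to log client-side region location lookups roughly amounts to turning up the client application's logging for the HBase client classes. A sketch of what that might look like in the client's log4j.properties; the logger name is an assumption based on the org.apache.hadoop.hbase.client package that HTable and HConnectionManager live in, so verify it against your client's classpath:

    # log4j.properties on the client: very chatty, enable only while chasing the pauses
    log4j.logger.org.apache.hadoop.hbase.client=DEBUG

DEBUG output at this level is voluminous under heavy load, so it is worth correlating it with the RegionServer request counts rather than leaving it on permanently.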
> > > > > > > > > > > We are constantly loading data to this cluster of 10 nodes. These pauses can happen as frequently as every minute, but sometimes are not seen for 15+ minutes. Basically, watching the RegionServer list with request counts is the only evidence of what is going on. All reads and writes totally stop, and if there is ever any activity it is on the node hosting the .META. table, with a request count of region count + 1. This problem seems to be worse with a larger region size. We tried a 1GB region size and saw this more than we saw actual activity (and stopped using a larger region size because of it). We went back to the default region size and it was better, but we had too many regions, so now we are up to 512M for a region size and we are seeing it more again.
> > > > > > > > > > >
> > > > > > > > > > > Does anyone know what this is? We have dug into all of the logs to find some sort of pause but are not able to find anything. Is this a WAL/HLog roll? Is this a region split or compaction? Of course our biggest fear is a GC pause on the master, but we do not have Java GC logging turned on for the master to tell. What could possibly stop the entire cluster from working for seconds at a time, very frequently?
> > > > > > > > > > >
> > > > > > > > > > > Thanks in advance for any ideas of what could be causing this.
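On the GC question raised above: the usual way to rule a master or RegionServer collection in or out is to enable GC logging through the JVM options in conf/hbase-env.sh. A minimal sketch (the variable, flags, and log path here are illustrative assumptions; adapt them to your install and Java version):

    # conf/hbase-env.sh: write a GC log so stop-the-world pauses show up with timestamps
    export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"

With that in place, a cluster-wide stall can be checked against the GC log timestamps on the master and on the node hosting .META. to confirm or rule out a long collection.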