From: Benjamin Reed
To: "zookeeper-dev@hadoop.apache.org", "joey42+reply@gmail.com", "hbase-dev@hadoop.apache.org"
Date: Wed, 8 Apr 2009 15:40:04 -0700
Subject: RE: Preventing SessionExpired events

I'm curious about your session scenario. If your servers can hang for X seconds without a problem and Y is your session timeout, why would you set Y < X? In your case, if you set your session timeout to 5 seconds, for example, but your server can hang for 20 seconds doing GC, your clients cannot expect a response in less than 20 seconds, so why don't you set your session timeout to 20 seconds?

ben
________________________________________
From: Joey Echeverria [joey42@gmail.com]
Sent: Wednesday, April 08, 2009 1:47 PM
To: hbase-dev@hadoop.apache.org
Cc: zookeeper-dev@hadoop.apache.org
Subject: Re: Preventing SessionExpired events

Nitay is correct about the native threads. Using the pure Java API, the garbage collector will occasionally pause other Java threads to do a full mark and sweep. Even switching to the concurrent collector only delays the problem.

The issue is mixing a high-throughput application (HBase) with a low-latency library (ZooKeeper). Systems like HBase live on relatively large numbers of short-lived objects. You only keep keys and values long enough for the Memcache to get full, then you write all the data to HDFS and throw away the objects. You can patch around the issue with object pools, but ultimately you need to insulate zk from the GC pauses.

In our experience, the best way to do that was a JNI wrapper around the zk C API. Since the C API uses its own POSIX threads, it's protected from the GC. In the system we wrote, we ended up using the Java API with a large session timeout for most everything, and used the JNI code just for creating ephemeral nodes.
-Joey

On Wed, Apr 8, 2009 at 9:35 PM, Nitay wrote:
> The default session timeout in HBase is currently 10 seconds. Bumping it up
> to 30 and 60 reduced SessionExpired exceptions, according to Andrew. I
> believe Andrew did run it under jconsole. He was also tuning GC parameters.
> He mentioned running using the incremental garbage collector
> (-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode). He can provide more
> details on all of this.
>
> My understanding with HBASE-1316 is that it solves the problem because the
> ZooKeeper IO/heartbeat thread becomes an OS-level thread which is not managed
> by Java. Hence, the GC does not starve it. Joey can comment here as he
> developed the solution.
>
> The three main components that use ZooKeeper in HBase are: client,
> regionserver, and master.
>
> The client does not have ephemeral nodes so having something like
> ZOOKEEPER-321 for it would be nice. It is currently read only. For now
> recovering it by reinitializing the ZooKeeper handle is not a big deal.
>
> The bigger issue is with the master and regionserver, which do use ephemeral
> nodes. Recovering them is a bit tougher, and we'd like to prevent getting
> SessionExpired as much as possible.
>
> On Wed, Apr 8, 2009 at 1:17 PM, Patrick Hunt wrote:
>
>> What are you running for a session timeout on your clients?
>>
>> Can you run with something like jvisualvm or jconsole, and watch the gc
>> activity when the session timeouts occur? Might give you some insight.
>> Have you tried one of the alternative GCs available in the VM?
>>
>> http://developer.amd.com/documentation/articles/pages/4EasyWaystodoJavaGarbageCollectionTuning.aspx
>> ie "Flags for Latency Applications"
>>
>> We are also working on the following jira:
>> https://issues.apache.org/jira/browse/ZOOKEEPER-321
>> which will eliminate session expirations for clients w/o ephemerals. (is
>> this the case for you?)
>>
>> Try turning on debug in your client, the client will spit out:
>> LOG.debug("Got ping response for sessionid:0x"
>> If you turn on trace logging in the server you should see session updates
>> there as well (c->server, which control session expiration).
>>
>> re HBASE-1316 - how does the jni c wrapper fix this? Isn't the code still
>> running w/in the same (vm) process?
>>
>>
>> Unfortunately I can't think of anything else if it is the GC. Basically
>> you'd have to increase the timeout or try another gc with lower latency.
>>
>> Perhaps Mahadev/Ben/Flavio might have insight...
>>
>> Patrick
>>
>>
>> Nitay wrote:
>>
>>> Hey guys,
>>>
>>> We've recently replaced a few pieces of HBase's cluster management and
>>> coordination with ZooKeeper. One of our guys, Andrew Purtell, has a
>>> cluster that he throws a lot of load at. Andrew's cluster was getting a
>>> lot of SessionExpired events which were causing some havoc. After some
>>> discussion on the hbase list and additional testing by Andrew (tweaking
>>> things like the session timeout, quorum size, and GC used), we suspect
>>> the problem is that the Java GC is starving the ZooKeeper heartbeat
>>> thread from executing.
>>>
>>> There is a JIRA open on the matter where Joey suggests a solution that has
>>> worked for him:
>>>
>>> https://issues.apache.org/jira/browse/HBASE-1316
>>>
>>> We wanted to loop you guys in to see if you have any thoughts/suggestions
>>> on the matter.
>>>
>>> Thanks,
>>> -n
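
The X-versus-Y question raised in this thread (longest pause X against session timeout Y) can be sketched as a small model. This is only an illustration and does not use the ZooKeeper client library: the class name `SessionTimeoutSketch`, both method names, and the safety margin are hypothetical, and the model assumes a session expires exactly when the client stays silent for longer than the timeout, ignoring ping intervals and network delay.

```java
/** Illustrative model only; names are hypothetical, not ZooKeeper or HBase APIs. */
class SessionTimeoutSketch {

    /**
     * Simplifying assumption: the server expires a session when it hears
     * nothing from the client for longer than the session timeout.
     */
    static boolean expires(long sessionTimeoutMs, long longestPauseMs) {
        return longestPauseMs > sessionTimeoutMs;
    }

    /**
     * Picks a timeout that covers the worst observed pause plus a margin.
     * The margin is an assumed allowance for ping scheduling and network delay.
     */
    static long suggestTimeoutMs(long marginMs, long... observedPausesMs) {
        long worst = 0;
        for (long pause : observedPausesMs) {
            worst = Math.max(worst, pause);
        }
        return worst + marginMs;
    }

    public static void main(String[] args) {
        // The example from the thread: 5 s timeout, 20 s GC pause.
        System.out.println("5 s timeout, 20 s pause expires: "
                + expires(5_000, 20_000));

        // Pause lengths here are made-up sample values, not measurements.
        long suggested = suggestTimeoutMs(5_000, 1_200, 20_000, 800);
        System.out.println("suggested timeout (ms): " + suggested);
        System.out.println("suggested timeout expires on 20 s pause: "
                + expires(suggested, 20_000));
    }
}
```

In this model, raising the timeout above the worst pause is enough. The JNI approach described by Joey instead tries to keep the heartbeat thread from being paused at all, which this sketch does not model.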