Subject: Re: Question regarding scalability of regionservers
From: Brad McCarty
Date: Wed, 17 Feb 2010 11:27:40 -0800
To: hbase-user@hadoop.apache.org

Thanks for the answers. We're just about ready with our test cluster and we will try this test specifically.

The number of Tomcat servers hitting a common row is currently 40, with potentially up to 100 threads each (at peak intervals).

You also said:

> This is hbase. You don't buy bigger hardware, you just add nodes (smile).

Not sure if that was tongue-in-cheek, because adding nodes wouldn't address the hot row issue, would it?

Thanks again,
Brad

On Feb 16, 2010, at 9:23 PM, Stack wrote:

> On Tue, Feb 16, 2010 at 7:28 PM, Brad McCarty wrote:
>
>> I read in another post that if one has a "hot" row in a table, meaning very heavy read access to the same row, the regionserver managing the region with that row can become a single bottleneck.
>>
>
> If hot, it'll probably get stapled into cache.
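Roughly what "stapled into cache" means: the regionserver keeps recently read blocks in an LRU block cache, so a block that is re-read constantly is never the least-recently-used entry and survives every eviction. A minimal stand-in sketch in plain Java (this is an illustration, not HBase's actual LruBlockCache; the class name is made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for an LRU block cache: an access-ordered
// LinkedHashMap that evicts the least-recently-used block when full.
public class HotBlockCache {
    private final Map<String, byte[]> cache;

    public HotBlockCache(final int capacity) {
        // accessOrder=true: every get() moves the entry to the "young" end
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
                return size() > capacity;
            }
        };
    }

    public byte[] get(String block)            { return cache.get(block); }
    public void put(String block, byte[] data) { cache.put(block, data); }
    public boolean contains(String block)      { return cache.containsKey(block); }

    public static void main(String[] args) {
        HotBlockCache c = new HotBlockCache(3);
        c.put("hot", new byte[]{1});
        // Cold blocks stream through, but the hot block is re-read each time...
        for (int i = 0; i < 100; i++) {
            c.put("cold-" + i, new byte[]{0});
            c.get("hot");                      // keeps "hot" young
        }
        // ...so it is never the eldest entry and survives every eviction.
        System.out.println("hot still cached: " + c.contains("hot"));
    }
}
```

Reads served out of this cache cost no disk I/O, which is why CPU (and request handling) becomes the limiting resource on the regionserver, as discussed below.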
>
>> Is my understanding accurate? If so, then assuming I can cache the data in the memstore, will CPU utilization become the likely limiting resource on that regionserver?
>
> Yes. That should be the case.
>
>> Also, if I'm hitting the region server from many client servers
>> (Tomcat app servers), will the socket connection management overhead
>> on the regionserver overwhelm that server?
>>
>
> How many clients? 4 or 500 tomcat threads?
>
> The way the ipc between hbase client and server works is that it keeps
> up a single socket connection and multiplexes request/response over
> this one connection. This is how hadoop rpc works.
>
>> If that's true, are there any other steps that can be taken to mitigate that risk, other than buying bigger hardware?
>>
>
> This is hbase. You don't buy bigger hardware, you just add nodes (smile).
>
> The proper answer to your questions above is for you to give it a test
> run. Try setting up a cluster of about 5 hbase nodes and try a tomcat
> server replaying a query log that resembles what you might have in
> production.
>
> Yours,
> St.Ack
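The suggested test run can be sketched as a small replayer: many client threads (standing in for the 40 Tomcat servers) driving a skewed query log where most reads hit one hot row. The sketch below uses a ConcurrentHashMap as a stand-in store so it runs anywhere; the class and method names are illustrative, not an HBase API. Pointing the same thread pool at a real 5-node cluster through the HBase client would give the actual numbers:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Minimal hot-row load sketch: N threads replay a skewed query log where
// ~90% of reads go to a single hot row. The store is an in-memory stand-in.
public class HotRowReplay {

    // Returns {hotHits, coldMisses} after all threads finish.
    public static long[] replay(Map<String, byte[]> store, int threads,
                                int requestsPerThread) throws InterruptedException {
        AtomicLong hotHits = new AtomicLong();
        AtomicLong coldMisses = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    // 9 of every 10 reads target the one hot row
                    String row = (i % 10 != 0) ? "hot-row" : "row-" + i;
                    if (store.get(row) != null) hotHits.incrementAndGet();
                    else coldMisses.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new long[] { hotHits.get(), coldMisses.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, byte[]> store = new ConcurrentHashMap<>();
        store.put("hot-row", new byte[]{42});       // only the hot row exists
        long[] counts = replay(store, 40, 1000);    // 40 "Tomcat servers"
        System.out.println("hot hits:    " + counts[0]);
        System.out.println("cold misses: " + counts[1]);
    }
}
```

Against a real cluster, watching CPU and handler-queue metrics on the regionserver hosting the hot row while this runs is what would confirm (or refute) the CPU-bottleneck concern raised above.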