From: Bradford Stephens
To: hbase-user@hadoop.apache.org
Date: Mon, 15 Jun 2009 11:17:16 -0700
Subject: Re: HBase Write to Regionservers behavior

Right now, we're storing the documents in HBase. The indices are stored
in HDFS and then 'sharded' to each node using Katta. Not sure if there's
much of an advantage to storing the index itself in HBase, though I'd be
interested to see some use cases for it.

On Sat, Jun 13, 2009 at 11:27 AM, zsongbo wrote:
> Hi Bradford Stephens,
> Could you please share something about your practices on "Katta+HBase"?
> Do you store the documents or indexes in HBase?
>
> Schubert
>
> On Fri, Jun 12, 2009 at 1:19 PM, Bradford Stephens <bradfordstephens@gmail.com> wrote:
>
>> That actually makes a lot of sense. Thanks, awesome people! Me and the
>> dev team are here to get Katta + HBase to play together, and it's
>> looking pretty nice.
>>
>> On Thu, Jun 11, 2009 at 9:47 PM, stack wrote:
>> > On Thu, Jun 11, 2009 at 6:10 PM, Bradford Stephens <bradfordstephens@gmail.com> wrote:
>> >
>> >> What I'm noticing is that it's writing to mostly one or two regions on
>> >> one box at a time, even though I have 7 reducers running. Monitoring
>> >> everything with dstat -v, I notice that only 2 of my servers are doing
>> >> much. These boxes have very low CPU idling, and high disk output (a
>> >> few GB a minute).
>> >
>> > How many regions in your table?
>> >
>> > At first, there is one. All reducers will go against it. When it splits,
>> > then two regions field the 7 reducers and so on.
>> >
>> > You can manually split regions from the command-line. See if that helps:
>> >
>> > hbase> split_region 'REGIONNAME'
>> >
>> > (IIRC -- type 'tools' in shell for help on the admin facilities).
>> >
>> > St.Ack
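
To illustrate the point stack makes in the quoted reply (a new table starts with a single region, so every reducer writes to it until it splits), here is a toy model in Python. It is only a sketch and does not use any HBase API. The row-key format, the split points, and the helper names region_for and write_distribution are all assumptions made up for the example. The one behaviour it borrows from HBase is that a row is served by the region whose sorted key range contains its row key.

```python
from bisect import bisect_right
from collections import Counter


def region_for(row_key, split_points):
    """Return the index of the region whose key range contains row_key.

    split_points is a sorted list of boundary keys (hypothetical values).
    An empty list models a table that still has its single initial region.
    A key equal to a boundary belongs to the upper region.
    """
    return bisect_right(split_points, row_key)


def write_distribution(row_keys, split_points):
    """Count how many writes land in each region."""
    return Counter(region_for(key, split_points) for key in row_keys)


# Hypothetical workload: 700 rows, e.g. 7 reducers writing 100 rows each.
row_keys = [f"doc-{i:04d}" for i in range(700)]

# Before any split: one region takes every write.
one_region = write_distribution(row_keys, [])

# After one split, and after each half has split again.
two_regions = write_distribution(row_keys, ["doc-0350"])
four_regions = write_distribution(
    row_keys, ["doc-0175", "doc-0350", "doc-0525"]
)

print(dict(one_region))
print(dict(two_regions))
print(dict(four_regions))
```

With no split points all 700 writes land in region 0; with one split point they divide 350/350, and with three they divide 175 each. In a real cluster the regions would also have to be assigned to different regionservers before the load spreads across machines, which this model does not cover.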