From: Otis Gospodnetic
To: user@hbase.apache.org
Subject: Re: hbase hashing algorithm and schema design
Date: Wed, 8 Jun 2011 18:02:16 -0700 (PDT)

Sam, would HBaseWD help you here? See
http://search-hadoop.com/m/AQ7CG2GkiO/hbasewd&subj=+ANN+HBaseWD+Distribute+Sequential+Writes+in+HBase

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
> From: Sam Seigal
> To: user@hbase.apache.org
> Cc: joey@cloudera.com; tsunanet@gmail.com
> Sent: Wed, June 8, 2011 4:54:24 PM
> Subject: Re: hbase hashing algorithm and schema design
>
> On Wed, Jun 8, 2011 at 12:40 AM, tsuna wrote:
> > On Tue, Jun 7, 2011 at 7:56 PM, Kjew Jned wrote:
> > > I was studying the OpenTSDB example, where they also prefix the row
> > > keys with the event id.
> > >
> > > I further modified my row keys to have this ->
> > > <eventid><uuid><timestamp>
> > >
> > > The uuid is fairly unique and random.
> > > Does appending a uuid to the event id help the distribution?
> >
> > Yes, it will help the distribution, but it will also make certain query
> > patterns harder. You can no longer scan a time range for a given
> > eventid. How to solve this problem depends on how you generate the
> > UUIDs.
> >
> > I wouldn't recommend doing this unless you've already tried simpler
> > approaches and reached the conclusion that they don't work. Many
> > people seem to be afraid of creating hot spots in their tables without
> > having first-hand evidence that the hot spots would actually be a
> > problem.
>
> Can I not use regex row filters to query for date ranges? There is added
> overhead for the client to order the results, and it is not an efficient
> query, but it is certainly possible to do, isn't it? Am I wrong?
>
> > > Let us say I have 4 region servers to start off with and I start the
> >
> > If you have only 4 region servers, your goal should be to have roughly
> > 25% of writes going to each server. It doesn't matter whether the 25%
> > slice of one server is going to a single region or not. As long as
> > all the writes don't go to the same row (which would cause lock
> > contention on that row), you'll get the same kind of performance.
>
> I am worried about the following scenario, hence putting a lot of thought
> into how to design this schema.
>
> For example, for simplicity's sake, say I only have two event ids, A and
> B, and the traffic is equally distributed between them, i.e. 50% of my
> traffic is event A and 50% is event B. I have two region servers running
> on two physical nodes, with the following schema -
> <eventid><timestamp>
>
> Ideally, I now have all of A's traffic going into region server A and all
> of B's traffic going into region server B. The cluster is able to hold
> this traffic, and the write load is distributed 50-50.
>
> However, I then reach a point where I need to scale, since the two region
> servers are no longer able to cope with the write traffic. Adding extra
> regionservers to the cluster is not going to make any difference, since
> only the physical machine holding the tail end of each region is the one
> that will receive the traffic. Most of the rest of my cluster is going to
> be idle.
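
(A minimal sketch of the bucketing idea behind HBaseWD, for the scenario
above, using the plain HBase client API of this era. The table name, column
family, bucket count and the [bucket byte][eventid][timestamp] key layout
are illustrative assumptions, not something from this thread. Writes for a
single event id spread over BUCKETS distinct key prefixes, and a time-range
read fans out into BUCKETS scans that the client merges.)

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BucketedEventKeys {

  // Illustrative bucket count; pick something >= the number of region servers.
  static final int BUCKETS = 8;

  // Assumed key layout: [1 bucket byte][eventid][timestamp]. The bucket is
  // derived deterministically from eventid+timestamp, so the same logical row
  // always maps to the same physical key.
  static byte[] rowKey(String eventId, long timestamp) {
    int bucket = ((eventId + timestamp).hashCode() & 0x7fffffff) % BUCKETS;
    return Bytes.add(new byte[] { (byte) bucket },
                     Bytes.toBytes(eventId),
                     Bytes.toBytes(timestamp));
  }

  static void writeEvent(HTable table, String eventId, long ts, byte[] value)
      throws IOException {
    Put put = new Put(rowKey(eventId, ts));
    put.add(Bytes.toBytes("d"), Bytes.toBytes("v"), value);  // family "d", qualifier "v"
    table.put(put);
  }

  // A time-range read for one eventid becomes BUCKETS scans, one per prefix,
  // merged on the client; re-sort by timestamp if global ordering matters.
  static List<Result> scanRange(HTable table, String eventId, long startTs, long endTs)
      throws IOException {
    List<Result> out = new ArrayList<Result>();
    for (int b = 0; b < BUCKETS; b++) {
      byte[] start = Bytes.add(new byte[] { (byte) b },
                               Bytes.toBytes(eventId), Bytes.toBytes(startTs));
      byte[] stop  = Bytes.add(new byte[] { (byte) b },
                               Bytes.toBytes(eventId), Bytes.toBytes(endTs));
      ResultScanner scanner = table.getScanner(new Scan(start, stop));
      try {
        for (Result r : scanner) {
          out.add(r);
        }
      } finally {
        scanner.close();
      }
    }
    return out;
  }
}

(The trade-off is the one tsuna describes in the quoted thread: better write
distribution for a single event id, at the cost of BUCKETS scans per
time-range read.)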
>
> To generalize, if I want to scale to where the # of machines is greater
> than the # of unique event ids, I have no way to distribute the load in an
> efficient manner, since I cannot distribute the load of a single event id
> across multiple machines (without adding a uuid somewhere in the middle
> and sacrificing data locality on ordered timestamps).
>
> Do you think my concern is genuine? Thanks a lot for your help.
>
> > > workload, how does HBase decide how many regions it is going to
> > > create, and what key is going to go into what region?
> >
> > Your table starts as a single region. As this region fills up, it'll
> > split. Where it splits is chosen by HBase. HBase tries to split the
> > region "in the middle", so that roughly half of the keys end up in each
> > new daughter region.
> >
> > You can also manually pre-split a table (from the shell). This can be
> > advantageous in certain situations where you know what your table will
> > look like and you have a very high write volume coupled with aggressive
> > latency requirements for the 95th percentile and above.
> >
> > > I could have gone with something like <...>, but would not like to,
> > > since my queries are always going to be against a particular event id
> > > type, and I would like them to be spatially located.
> >
> > If you have a lot of data per <eventid>, then putting the <uuid> in
> > between the <eventid> and the <timestamp> will screw up data locality
> > anyway. But the exact details depend on how you pick the <uuid>.
> >
> > --
> > Benoit "tsuna" Sigoure
> > Software Engineer @ www.StumbleUpon.com
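
(On the pre-splitting point in the quoted thread: if the keys carry a
one-byte bucket prefix as in the earlier sketch, the table can be created
with one region per bucket up front rather than waiting for a single initial
region to split under write load. A hedged sketch with the Java admin API of
the same era follows; the table name, family and bucket count are again
illustrative assumptions, and the shell's create command can likewise take
explicit split points.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PresplitEventsTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("events");   // illustrative name
    desc.addFamily(new HColumnDescriptor("d"));

    // One split point per bucket prefix byte: with 7 split points the table
    // starts with 8 regions (keys < 0x01, 0x01..0x02, ..., >= 0x07) instead
    // of one region that only splits once it has filled up.
    int buckets = 8;
    byte[][] splitKeys = new byte[buckets - 1][];
    for (int i = 1; i < buckets; i++) {
      splitKeys[i - 1] = new byte[] { (byte) i };
    }

    admin.createTable(desc, splitKeys);
  }
}

(As tsuna notes, this mainly pays off when you already know what the key
space will look like and you have a high write volume with tight latency
requirements.)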