Date: Thu, 23 Aug 2012 23:15:52 +0800
Subject: Re: how client location a region/tablet?
From: Lin Ma
To: user@hbase.apache.org, doug.meil@explorysmedical.com

Doug, very informative document. Thanks a lot!

I read through it and have some thoughts:

- Suppose that, at the beginning, the client-side cache of region information
  is empty, and the client wants to GET row key 123 from table ABC.
- The client will read from the ROOT table first. Unfortunately, the ROOT
  table only contains region information for the META table (please correct
  me if I am wrong), not region information for the real data table (e.g.
  table ABC).
- Does the client have to call each META region server one by one in order
  to find which META region contains the information about the region that
  owns row key 123 of data table ABC?

BTW: I think that if there were a way to expose, from the .META. region key,
what range of tables/regions each META region contains, it would save the
time spent iterating over the META region servers one by one. Please feel
free to correct me if I am wrong.

regards,
Lin

On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil wrote:

>
> For further information about the catalog tables and region-regionserver
> assignment, see this...
>
> http://hbase.apache.org/book.html#arch.catalog
>
>
> On 8/19/12 7:36 AM, "Lin Ma" wrote:
>
> >Thank you Stack, especially for the smart 6 round trip guess for the
> >puzzle. :-)
> >
> >1. "Yeah, we client cache's locations, not the data." -- does it mean for
> >each client, it will cache all location information of a HBase cluster,
> >i.e. which physical server owns which region?
> >Supposing each region has 128M bytes, for a big cluster (P-bytes level),
> >total data size / 128M is not a trivial number, not sure if any overhead
> >to client?
> >2. A bit confused by what do you mean "not the data"? For the client
> >cached location information, it should be the data in table METADATA,
> >which is region / physical server mapping data. Why you say not data (do
> >you mean real content in each region)?
> >
> >regards,
> >Lin
> >
> >On Sun, Aug 19, 2012 at 12:40 PM, Stack wrote:
> >
> >> On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma wrote:
> >> > Hello guys,
> >> >
> >> > I am referencing the Big Table paper about how a client locates a
> >> > tablet. In section 5.1 Tablet location, it is mentioned that client
> >> > will cache all tablet locations, I think it means client will cache
> >> > root tablet in METADATA table, and all other tablets in METADATA
> >> > table (which means client cache the whole METADATA table?). My
> >> > question is, whether HBase implements in the same or similar way? My
> >> > concern or confusion is, supposing each tablet or region file is 128M
> >> > bytes, it will be very huge space (i.e. memory footprint) for each
> >> > client to cache all tablets or region files of METADATA table. Is it
> >> > doable or feasible in real HBase clusters? Thanks.
> >> >
> >>
> >> Yeah, we client cache's locations, not the data.
> >>
> >>
> >> > BTW: another confusion from me is in the paper of Big Table section
> >> > 5.1 Tablet location, it is mentioned that "If the client's cache is
> >> > stale, the location algorithm could take up to six round-trips,
> >> > because stale cache entries are only discovered upon misses (assuming
> >> > that METADATA tablets do not move very frequently).", I do not know
> >> > how the 6 times round trip time is calculated, if anyone could answer
> >> > this puzzle, it will be great. :-)
> >> >
> >>
> >> I'm not sure what the 6 is about either.
> >> Here is a guesstimate:
> >>
> >> 1. Go to cached location for a server for a particular user region,
> >> but server says that it does not have a region, the client location is
> >> stale
> >> 2. Go back to client cached meta region that holds user region w/ row
> >> we want, but its location is stale.
> >> 3. Go to root location, to find new location of meta, but the root
> >> location has moved.... what the client has is stale
> >> 4. Find new root location and do lookup of meta region location
> >> 5. Go to meta region location to find new user region
> >> 6. Go to server w/ user region
> >>
> >> St.Ack
> >>
> >
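
Stack's six-step guesstimate quoted above can be sketched as a small toy model. This is only an illustration of that guess, not HBase or Bigtable client code: the function name `lookup_steps` and its three boolean flags are hypothetical, and step 4 ("find new root location and do lookup of meta region location") is counted as a single round trip because that is how the list counts it.

```python
def lookup_steps(user_fresh, meta_fresh, root_fresh):
    """Return the hops a client makes to reach a user region.

    Each flag says whether the client's cached location for that level
    (user region, meta region, root) is still correct. Toy model only.
    """
    # The client always tries its cached user-region location first.
    steps = ["user region server (cached location)"]
    if user_fresh:
        return steps

    # Cache miss: ask the meta region the client has cached.
    steps.append("meta region (cached location)")
    if not meta_fresh:
        # The cached meta location is stale too, so fall back to root.
        steps.append("root (cached location)")
        if not root_fresh:
            # Root has moved as well; counted as one trip, as in the list.
            steps.append("find new root location, look up meta location")
        steps.append("meta region (new location)")

    # Finally contact the server that really holds the user region.
    steps.append("user region server (new location)")
    return steps


if __name__ == "__main__":
    for flags in [(True, True, True), (False, True, True),
                  (False, False, True), (False, False, False)]:
        print(flags, len(lookup_steps(*flags)))
```

Under these assumptions the all-stale case yields six hops, one per item in the list, while a client whose cached meta location is still good needs only three.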