Date: Tue, 19 Jun 2012 00:19:30 +0530
Subject: Re: How does scan work internally? Does it make use of multi-threading/replication?
From: IGZ Nick
To: user@hbase.apache.org

Okay. Let me give a more specific example. Say I have 3 contiguous regions, all served by one RS. If I do a scan that gets data from each of the regions, then everything has to come through this RS, which would be slow. Or is there any optimization such that contiguous regions don't end up being served by the same regionserver?

On Tue, Jun 19, 2012 at 12:11 AM, Jean-Daniel Cryans wrote:
> On Mon, Jun 18, 2012 at 11:34 AM, IGZ Nick wrote:
> > Hi Jean,
> >
> > Thank you for your reply. So RS is a completely different entity when
> > compared to the datanode?
>
> Totally.
>
> > How does RS serve the data?
>
> That's HBase 101, I recommend you read the guide
> http://hbase.apache.org/book/book.html or the book
> http://ofps.oreilly.com/titles/9781449396107/ or the bigtable paper.
>
> > I can view the region directories in HDFS. So the same region must be
> > on 3 datanodes, right?
>
> Yep.
>
> > Then which regionserver gets to serve that region?
>
> HBase 101, but in short the master decides that.
>
> > Is it a completely random regionserver?
>
> The master uses a few heuristics.
>
> > And if I ask that region server for all keys from that region, will it
> > have to come from the same HDFS datanode?
>
> Depends if the data is there, if it is then it will be served locally
> else it will be fetched. It doesn't really matter to the region server
> since the HDFS client handles it transparently.
>
> > As far as I understand, in HDFS, if I stream a file, then I get the data
> > from a single datanode (the one closest to the client, usually). So, in
> > HBase, if I ask for all keys in region reg1, then I get all the keys from
> > the datanode that is closest to the client?
>
> Yep
>
> J-D
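
To make the three-region example concrete, here is a toy model in Python. It is not HBase code: the region layout, the row data, and the names REGIONS, ROWS, locate_region and scan are all made up for illustration. It assumes a plain client scan that walks regions one after another in key order, so when all three contiguous regions sit on one RS, every row comes back through that RS in sequence.

```python
from bisect import bisect_right

# Hypothetical layout: (start_key, end_key, regionserver); "" means open-ended.
# All three contiguous regions are assigned to the same regionserver, "rs1".
REGIONS = [
    ("", "g", "rs1"),
    ("g", "n", "rs1"),
    ("n", "", "rs1"),
]

# Hypothetical table contents, keyed by row key.
ROWS = {"apple": 1, "grape": 2, "kiwi": 3, "peach": 4}


def locate_region(row_key):
    """Return the region whose [start, end) key range contains row_key."""
    starts = [start for start, _, _ in REGIONS]
    return REGIONS[bisect_right(starts, row_key) - 1]


def scan(start_row="", stop_row=""):
    """Scan [start_row, stop_row) one region at a time, in key order.

    Returns the matching rows and the list of regions visited, in order.
    An empty stop_row means "scan to the end of the table".
    """
    results = []
    visited = []
    current = start_row
    while True:
        start, end, server = locate_region(current)
        visited.append((start, end, server))
        # Upper bound for this step: the region's end key or the scan's
        # stop row, whichever comes first ("" means unbounded).
        upper = end if end and (not stop_row or end < stop_row) else stop_row
        for key in sorted(ROWS):
            if key >= current and (not upper or key < upper):
                results.append((key, ROWS[key]))
        # Finish at the last region or once the stop row has been reached.
        if not end or (stop_row and end >= stop_row):
            break
        current = end  # the next region starts where this one ends
    return results, visited


rows, regions = scan()
print(rows)
print([server for _, _, server in regions])
```

In this model a full scan touches the three regions strictly in order and all of them resolve to rs1. Whether a real cluster spreads contiguous regions across regionservers depends on the master's balancing heuristics, which the sketch does not attempt to model.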