From: Harsh J
Date: Wed, 12 Dec 2012 01:50:06 +0530
Subject: Re: Heterogeneous cluster
To: "user@hbase.apache.org"

Hi,

On Wed, Dec 12, 2012 at 12:18 AM, Jean-Marc Spaggiari wrote:
> Hi Anoop,
>
> Thanks for the clarification.
>
> So let's take one example.
>
> Let's say I have 4 nodes and a replication factor set to 3.
>
> I have a region hosted on N1, replicated on N2 and N3. Nothing about
> this region on N4.

The important bit is that, pending further enhancements along this
line, "regions" are not replicated. A region's data is replicated on
HDFS, but the region itself is not. It is served from a single point
(the RegionServer it is currently assigned to). From a client's point
of view, reads of region data go through the RegionServer layer, not
directly to the DataNodes.

> It's time to run an MR job, and someone needs to work on the given region.
> N1 is too busy, so the region will be given to another node. Does it mean
> it will be given randomly between N2, N3 and N4?

HBase jobs are submitted with the split location for each region set
to its current assignee (at the time of submission). This is what
gives the "locality".

> If it's given to N4, it's missing an opportunity to get the data almost locally.

If your task gets assigned to any other node, or if the region moves
after the job has begun, the data locality of the reads the
RegionServer does may easily be affected, yes.

> Also, if the job is given to N2 or N3, are they going to remotely query
> the data over the network from N1? Or are they able to read it from
> the replica? Based on what you are saying, it seems that they will
> retrieve it from N1. Is there not another opportunity to improve the
> process by reading from the replicated data and not from the master
> one?

As explained above, all reads go through the assigned RegionServer, so
the concept of HDFS block replicas can't be applied here yet (I do
know enhancements around this are planned).

> When you are talking about "the short circuit read option", is this
> something we need to enable as a property? Or is it more like a piece
> of code?

It's configs, and the speed-drug details are at
http://hbase.apache.org/book.html#perf.hdfs, section "11.10.2.
Leveraging local data".

--
Harsh J
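
To make the read path above concrete, here is a toy model in Python. It is only a sketch of the reasoning in this mail, not HBase code: `Region`, `split_location` and `read_path` are hypothetical names invented for the illustration, and the node names come from the N1..N4 example in the question.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    assigned_to: str           # the single RegionServer currently serving the region
    block_replicas: frozenset  # nodes whose DataNodes hold the region's HDFS blocks


def split_location(region: Region) -> str:
    """Location hint recorded at job submission: the region's current assignee."""
    return region.assigned_to


def read_path(region: Region, task_node: str) -> dict:
    """Describe the two hops of a read issued by a task running on task_node."""
    server = region.assigned_to
    return {
        # Hop 1: the task always talks to the assigned RegionServer,
        # even if the task's own node holds an HDFS replica.
        "task_to_regionserver": "local" if task_node == server else "remote",
        # Hop 2: the RegionServer reads the HDFS blocks; this hop is local
        # only if the RegionServer's own node holds a replica.
        "regionserver_to_hdfs": "local" if server in region.block_replicas else "remote",
    }


# The example from the question: region on N1, block replicas on N1, N2, N3.
region = Region("r1", assigned_to="N1", block_replicas=frozenset({"N1", "N2", "N3"}))
hint = split_location(region)  # taken once, at submission time

for node in ("N1", "N2", "N4"):
    print(node, read_path(region, node))

# If the region moves to N4 after submission, the hint is stale and the
# new RegionServer's node holds no replica in this model.
region.assigned_to = "N4"
print("after move, task on", hint, read_path(region, hint))
```

In this model a task on N2 or N3 gains nothing from the replica on its own node, because the first hop always goes to the RegionServer holding the assignment. That is the sense in which HDFS block replicas cannot be used directly for reads here.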