From: Harsh J <harsh@cloudera.com>
Date: Wed, 12 Dec 2012 01:50:06 +0530
Subject: Re: Heterogeneous cluster
To: "user@hbase.apache.org" <user@hbase.apache.org>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

On Wed, Dec 12, 2012 at 12:18 AM, Jean-Marc Spaggiari
<jean-marc@spaggiari.org> wrote:
> Hi Anoop,
>
> Thanks for the clarification.
>
> So let's take one example.
>
> Let's say I have 4 nodes and a replication factor set to 3.
>
> I have a region hosted on N1, replicated on N2 and N3. Nothing about
> this region on N4.

The important bit is that, pending further enhancements along this line,
"regions" are not replicated. A region's data is replicated on HDFS, but
the region itself is not. It is served from a single point (the
RegionServer it is currently assigned to). From a client's POV, region
data read requests go through the RegionServer layer, not directly to
the DataNodes.

> It's time to run an MR job, and someone needs to work on the given region.
> N1 is too busy, so the region will be given to another node. Does it mean
> it will be given randomly between N2, N3 and N4?

HBase jobs are submitted with the split location for each region set to
its current assignee (at the time of submission). This is what gives the
"locality".

> If it's given to N4, it's missing an opportunity to get the data almost locally.

If your task gets assigned to any other node, or if the region moves
after the job has begun, the data locality of the reads the RegionServer
does may easily be affected, yes.
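
To make the above concrete, here is a small toy model in plain Java. It
is only a sketch: the class and method names (SplitLocality, Region,
splitLocation, readsStayOnNode) are made up for illustration and are not
the HBase API. It assumes the 4-node example from the question, with the
region assigned to N1 and its HDFS blocks on N1, N2 and N3.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the behaviour described in this thread.
// None of these names come from HBase itself.
class SplitLocality {

    static final class Region {
        final String name;
        final String assignedServer;     // the single RegionServer serving the region
        final List<String> hdfsReplicas; // DataNodes holding copies of its data

        Region(String name, String assignedServer, List<String> hdfsReplicas) {
            this.name = name;
            this.assignedServer = assignedServer;
            this.hdfsReplicas = hdfsReplicas;
        }
    }

    // The split location advertised to the job is the current assignee only,
    // not the full list of HDFS replica holders.
    static String splitLocation(Region region) {
        return region.assignedServer;
    }

    // A task's reads stay on its own node only when it runs where the region
    // is served; holding an HDFS replica does not help, because every read
    // goes through the assigned RegionServer.
    static boolean readsStayOnNode(Region region, String taskNode) {
        return taskNode.equals(region.assignedServer);
    }

    public static void main(String[] args) {
        Region region = new Region("region-1", "N1", Arrays.asList("N1", "N2", "N3"));
        System.out.println("split location: " + splitLocation(region));
        for (String node : Arrays.asList("N1", "N2", "N3", "N4")) {
            System.out.println(node
                + ": holds HDFS replica=" + region.hdfsReplicas.contains(node)
                + ", task reads stay on node=" + readsStayOnNode(region, node));
        }
    }
}
```

In this model a task landing on N2 or N3 is no better off than one on
N4: all three have to go to N1 over the network.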

> Also, if the job is given to N2 or N3, are they going to remotely query
> the data over the network from N1? Or are they able to read it from
> the replica? Based on what you are saying, it seems that they will
> retrieve it from N1. Is there not another opportunity to improve the
> process by reading from the replicated data and not from the master
> one?

As explained above, all reads go through the assigned regionserver. So
the concept of HDFS block replicas can't be applied here yet (I do
know enhancements around this are planned).

> When you are talking about "the short circuit read option", is this
> something we need to enable as a property? Or is it more like a piece
> of code?

It's a set of configs, and the details on the speed-up are at
http://hbase.apache.org/book.html#perf.hdfs, section "11.10.2.
Leveraging local data".
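
For illustration only, the properties involved look roughly like the
fragment below. The exact names and the files they belong in depend on
your Hadoop and HBase versions, and the "hbase" user value is an
assumption about which user runs your RegionServers, so check the
section linked above before copying anything.

```xml
<!-- hbase-site.xml: the RegionServer is the HDFS client here -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml on the DataNodes: the user allowed to read block
     files directly (assumed here to be "hbase") -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>
```

This only helps when the RegionServer and the DataNode holding the block
are on the same host, which is why region locality matters.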

--
Harsh J