hbase-user mailing list archives

From john smith <js1987.sm...@gmail.com>
Subject Re: Region assignment in Hbase
Date Tue, 30 Mar 2010 02:49:26 GMT
J-D, thanks for your reply. I have some doubts, which I have posted inline.
Kindly help me.

On Tue, Mar 30, 2010 at 2:23 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Inline.
> J-D
> On Mon, Mar 29, 2010 at 11:45 AM, john smith <js1987.smith@gmail.com>
> wrote:
> > Hi all,
> >
> > I read the issue HBase-57 (https://issues.apache.org/jira/browse/HBASE-57).
> > I don't really understand the use of assigning regions keeping DFS in
> > mind. Can anyone give an example use case showing its advantages?
> A region is composed of files, files are composed of blocks. To read
> data, you need to fetch those blocks. In HDFS you normally have access
> to 3 replicas and you fetch one of them over the network. If one of
> the replicas is on the local datanode, you don't need to go through the
> network. This means less network traffic and better response time.

Is this the scenario that occurs when serving read requests? In the
thread "Data distribution in HBase", someone mentioned that the data hosted
by a Region Server may not actually reside on the same machine, so when
asked for data, it fetches it from the machine that holds it. Am I right?
Why doesn't the data hosted by a "Region Server" lie on the same machine?
Doesn't the name "Region Server" imply that it holds all the regions it
contains? Is it due to splits or to restarting HBase?
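
To make the question concrete, here is a minimal sketch of how "local" a
region's data is to the Region Server hosting it. It is only a toy model, not
HBase or HDFS code: the function name, host names, and block IDs are all
hypothetical, and it assumes each block simply maps to the set of hosts that
hold one of its replicas.

```python
# Toy model (hypothetical names): block ID -> hosts holding a replica of it.
def locality_fraction(region_server_host, block_replicas):
    """Fraction of blocks that have a replica on the region server's own host."""
    if not block_replicas:
        return 0.0
    local = sum(
        1 for hosts in block_replicas.values() if region_server_host in hosts
    )
    return local / len(block_replicas)


# Four blocks of one region's files, each with 3 replicas.
blocks = {
    "blk_1": {"node-a", "node-b", "node-c"},
    "blk_2": {"node-b", "node-c", "node-d"},
    "blk_3": {"node-a", "node-c", "node-d"},
    "blk_4": {"node-b", "node-d", "node-e"},
}

# Region served from node-a: 2 of 4 blocks can be read without the network.
print(locality_fraction("node-a", blocks))  # 0.5
# Region served from node-f: every read goes over the network.
print(locality_fraction("node-f", blocks))  # 0.0
```

In this model, a Region Server on a host that holds none of the replicas
still serves the region; it just reads every block remotely.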

> > Can
> > map-reduce exploit its advantage in any way (if data is distributed in the
> > above manner), or is it just the read-write performance that gets improved?
> MapReduce works in the exact same way: it always tries to put the
> computation next to where the data is. I recommend reading the
> MapReduce tutorial
> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Overview

I guess the same applies here. When a map task is run on a Region Server,
its data may not actually lie on the same machine, so it fetches it from
the machine that holds it. This reduces data locality!
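
The "put the computation next to the data" idea can be sketched as a
placement rule. This is an illustration only, not the actual Hadoop scheduler:
`pick_host` and the host names are hypothetical, and it assumes each input
split advertises a list of preferred hosts.

```python
# Toy placement rule (hypothetical): prefer a free host that holds the data.
def pick_host(preferred_hosts, free_hosts):
    """Return (host, is_local) for a task, or (None, False) if nothing is free."""
    for host in preferred_hosts:
        if host in free_hosts:
            return host, True
    if not free_hosts:
        return None, False
    # No preferred host is free: run anywhere and read the data remotely.
    return sorted(free_hosts)[0], False


# The split's preferred host is free, so the task runs next to its data.
print(pick_host(["node-a"], {"node-a", "node-b"}))  # ('node-a', True)
# The preferred host is busy, so the task runs elsewhere and reads remotely.
print(pick_host(["node-a"], {"node-b", "node-c"}))  # ('node-b', False)
```

If the preferred host is the Region Server's machine but the underlying
blocks live elsewhere, the task counts as local to the Region Server while
the block reads still cross the network.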

> > Can someone please help me understand this?
> >
> > Regards
> > JS
> >
