hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Questions about HBase load balancing and HFile
Date Wed, 22 Jan 2014 05:50:16 GMT
bq. capacity load on terms of numbers of regions per region server

I guess you meant to say 'in terms of ...'

Yes. 0.94 load balancer looks at region count only.
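
To illustrate what "region count only" implies, here is a toy sketch in Python. It is not the HBase balancer code: the function name and the server/region names are made up for the example, and it only assumes the behaviour described above, namely that the balancer evens out the number of regions per server and ignores request load.

```python
def balance_by_region_count(assignments):
    """Toy count-only balancer (hypothetical, not HBase code).

    Moves regions from the server holding the most regions to the server
    holding the fewest until the counts differ by at most one.
    Request load per region is deliberately ignored.
    """
    # Copy the input so the caller's mapping is not modified.
    servers = {rs: list(regions) for rs, regions in assignments.items()}
    moves = []
    while True:
        most = max(servers, key=lambda rs: len(servers[rs]))
        least = min(servers, key=lambda rs: len(servers[rs]))
        if len(servers[most]) - len(servers[least]) <= 1:
            break
        region = servers[most].pop()
        servers[least].append(region)
        moves.append((region, most, least))
    return servers, moves


if __name__ == "__main__":
    # rs1 holds far more regions than the others, so regions get moved.
    _, moves = balance_by_region_count(
        {"rs1": ["a", "b", "c", "d", "e"], "rs2": ["f"], "rs3": []}
    )
    print(moves)

    # Counts are already even, so nothing moves, even if both regions
    # on rs1 happen to be hot.
    _, moves = balance_by_region_count(
        {"rs1": ["hot1", "hot2"], "rs2": ["cold1", "cold2"]}
    )
    print(moves)
```

In the second call nothing is moved, because a count-only policy cannot tell a server with two hot regions from one with two idle regions.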


On Tue, Jan 21, 2014 at 9:39 PM, Asaf Mesika <asaf.mesika@gmail.com> wrote:

> If hot means many requests, then it's only in 0.96 right? 0.94 is only
> addressing capacity load on terms of numbers of regions per region server
> of the same table.
>
> On Monday, January 20, 2014, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. under heavy load by serving to hot regions
> >
> > Did you mean 'two hot regions' ?
> > If so, the master will move one of them to another RS.
> >
> > Cheers
> >
> >
> > On Mon, Jan 20, 2014 at 6:17 AM, Bill Q <bill.q.hdp@gmail.com> wrote:
> >
> > > Hi Ted and Bharath,
> > > Thanks a lot for the replies.
> > >
> > > For question #1, if a RS is under heavy load by serving to hot
> > > regions, will the HMaster move one of the two regions to another RS, or
> > > will it split both of them and move the newly created halves to other
> > > RSs?
> > >
> > > For question #3, does this mean that a HFile has many 64k blocks, but
> > > itself is around 64M (or 128M)?
> > >
> > >
> > > Many thanks.
> > >
> > >
> > > Bill
> > >
> > >
> > > On Mon, Jan 20, 2014 at 1:49 AM, Bharath Vissapragada <bharathv@cloudera.com> wrote:
> > >
> > > > For question #3, the block size Lars talks about is the block size inside
> > > > an HFile, which is different from the HDFS block size. Look at
> > > > http://hbase.apache.org/book/apes03.html . An HFile is indexed by block to
> > > > facilitate random access to data, so that we can skip unnecessary disk
> > > > blocks during gets/scans. The smaller the HFile block size, the better the
> > > > random read performance. You can see the detailed HFile layout in that link.
> > > >
> > > > For question #4, you are correct: since the data resides on HDFS, each
> > > > region server has access to all the store files (they just use the HDFS API
> > > > to read them). They are still available after a (RS+datanode) crash
> > > > because of the replication in HDFS. The store files still have valid
> > > > replicas, and the namenode tries to maintain the replication factor by
> > > > re-replicating them eventually.
> > > >
> > > >
> > > > On Mon, Jan 20, 2014 at 12:08 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > For question #1, there is a load balancer in HMaster which does the job of
> > > > > balancing region load.
> > > > >
> > > > > For number 2, the daughter regions stay on the same server as the parent
> > > > > after the split. Later one or both of them may be moved to other region
> > > > > servers.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Jan 19, 2014, at 10:27 PM, Bill Q <bill.q.hdp@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > > I am trying to get more information about HBase. I would appreciate some
> > > > > > answers to these few questions. Thanks a lot.
> > > > > >
> > > > > > 1. About load balancing: does HMaster monitor overloaded or lightly loaded
> > > > > > HRegionServers, and move some regions from the hot HRegionServer to
> > > > > > lightly loaded ones (with or without adding new servers to the cluster,
> > > > > > respectively)?
> > > > > >
> > > > > > 2. About region splitting: when splitting a region, will the newly created
> > > > > > regions stay on the current HRegionServer, or will HMaster assign some new
> > > > > > HRegionServers to take the newly created two regions?
> > > > > >
> > > > > > 3. About HFile size: Lars mentioned here
> > > > > > http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html that
> > > > > > the HFile size defaults to 64k. How does this work while the default HDFS
> > > > > > block is 64M/128M? Would the small HFile size waste lots of space on HDFS?
> > > > > >
> > > > > > 4. About data locality: if a HRegionServer fails, the HMaster would assign
> > > > > > a new HRegionServer to take its place. But should this new HRegionServer
> > > > > > have access to the storeFiles? I assumed that's how it works by
> > > > > > using HDFS's data replication. But after some readings, I got confused. It
> > > > > > seems that the new HRegionServer can work without the storeFiles data a
>
