hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Questions about HBase load balancing and HFile
Date Mon, 20 Jan 2014 15:20:20 GMT
bq. under heavy load by serving to hot regions

Did you mean 'two hot regions' ?
If so, the master will move one of them to another RS.
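
To make the "move one of them" case concrete, here is a toy sketch in Python. It is not HBase's balancer code: the function name pick_move, the server names, and the rule of comparing plain region counts are all assumptions made up for illustration.

```python
def pick_move(assignments):
    """Return (region, source, target) for one balancing move, or None.

    assignments maps a server name to the list of regions it hosts.
    Toy rule (an assumption, not HBase's algorithm): move one region from
    the server with the most regions to the one with the fewest, but only
    when the two differ by more than one region.
    """
    source = max(assignments, key=lambda s: len(assignments[s]))
    target = min(assignments, key=lambda s: len(assignments[s]))
    if len(assignments[source]) - len(assignments[target]) <= 1:
        return None  # already balanced by region count
    return assignments[source][-1], source, target


# Hypothetical cluster: both hot regions sit on rs1, rs2 is idle.
cluster = {"rs1": ["hot-a", "hot-b"], "rs2": []}
print(pick_move(cluster))  # ('hot-b', 'rs1', 'rs2')
```

The sketch only moves a region; it never splits one, which matches the answer above.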

Cheers


On Mon, Jan 20, 2014 at 6:17 AM, Bill Q <bill.q.hdp@gmail.com> wrote:

> Hi Ted and Bharath,
> Thanks a lot for the replies.
>
> For question #1, if an RS is under heavy load by serving to hot
> regions, will the HMaster move one of the two regions to another RS, or
> will it split both of them and move the newly created halves to other
> RSs?
>
> For question #3, does this mean that an HFile has many 64k blocks, but
> is itself around 64M (or 128M)?
>
>
> Many thanks.
>
>
> Bill
>
>
> On Mon, Jan 20, 2014 at 1:49 AM, Bharath Vissapragada <bharathv@cloudera.com> wrote:
>
> > For question #3, the block size Lars talks about is the block size inside
> > an HFile, which is different from the HDFS block size. Look at
> > http://hbase.apache.org/book/apes03.html . An HFile is indexed by block to
> > facilitate random access to data, so that gets and scans can skip
> > unnecessary disk blocks. The smaller the HFile block size, the better the
> > random read performance. You can see the detailed HFile layout at that link.
> >
> > For question #4, you are correct: since the data resides on HDFS, each
> > region server has access to all the store files (they just use the HDFS API
> > to read them). They are still available after an (RS + datanode) crash
> > because of replication in HDFS. The store files still have valid replicas,
> > and the namenode tries to maintain the replication factor by eventually
> > re-replicating them.
> >
> >
> > On Mon, Jan 20, 2014 at 12:08 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > For question #1, there is a load balancer in the HMaster which does the
> > > job of balancing region load.
> > >
> > > For number 2, the daughter regions stay on the same server as the parent
> > > after the split. Later, one or both of them may be moved to other region
> > > servers.
> > >
> > > Cheers
> > >
> > > On Jan 19, 2014, at 10:27 PM, Bill Q <bill.q.hdp@gmail.com> wrote:
> > >
> > > > Hi,
> > > > I am trying to get more information about HBase. I would appreciate some
> > > > answers to these few questions. Thanks a lot.
> > > >
> > > > 1. About load balancing: does the HMaster monitor overloaded or lightly
> > > > loaded HRegionServers, and move some regions from the hot HRegionServer
> > > > to lightly loaded ones (with or without adding new servers to the
> > > > cluster, respectively)?
> > > >
> > > > 2. About region splitting: when splitting a region, will the newly created
> > > > regions stay on the current HRegionServer, or will the HMaster assign
> > > > some new HRegionServers to take the two newly created regions?
> > > >
> > > > 3. About HFile size: Lars mentioned here
> > > > http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html that
> > > > the HFile size defaults to 64k. How does this work when the default HDFS
> > > > block is 64M/128M? Would the small HFile size waste lots of space on HDFS?
> > > >
> > > > 4. About data locality: if an HRegionServer fails, the HMaster would assign
> > > > a new HRegionServer to take its place. But should this new HRegionServer
> > > > have access to the storeFiles? I assumed that's how it works, by using
> > > > HDFS's data replication. But after some reading, I got confused. It seems
> > > > that the new HRegionServer can work without the storeFiles data being
> > > > local. How does this work at all?
> > > >
> > > > Many thanks.
> > > >
> > > >
> > > > Bill
> > >
> >
> >
> >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>
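
On question #3 in the quoted thread, a rough worked example of how the two block sizes relate. The 64k and 64M figures are the ones quoted above; the 128M store file size is a made-up value for illustration, and treating the HFile block size as exact is a simplification.

```python
KB = 1024
MB = 1024 * KB

hfile_block = 64 * KB   # HFile block size mentioned in the thread
hdfs_block = 64 * MB    # HDFS block size mentioned in the thread
hfile_size = 128 * MB   # hypothetical size of one store file


def blocks_needed(size, block):
    """Number of blocks of the given size needed to hold `size` bytes."""
    return -(-size // block)  # ceiling division


print(blocks_needed(hfile_size, hfile_block))  # 2048 HFile blocks
print(blocks_needed(hfile_size, hdfs_block))   # 2 HDFS blocks
```

So under these assumptions one store file is a single HDFS file spanning a couple of HDFS blocks, and the 64k blocks are units inside that file rather than separate small files.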
