hbase-user mailing list archives

From Bharath Vissapragada <bhara...@cloudera.com>
Subject Re: DFS Balancer with Hbase
Date Tue, 04 Mar 2014 14:22:02 GMT
Yes, it's included in 0.94.2. Add this property to the master's
hbase-site.xml; it requires a master restart. Let the balancer run,
confirm that the new assignment is in place, and then run a major compaction
during a maintenance window. Running a major compaction at least once a week
is recommended, since it clears out the data corresponding to deletes
(which is useful in your case) and also improves the block locality index for
RS-local reads.
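
To illustrate why balancing by table can matter, here is a minimal toy sketch in Python. It is not HBase's balancer code: the table names, region sizes, server names and both placement functions are hypothetical, and the count-only placement is just one assignment that has equal region counts per server, chosen to show the bad case described in the quoted message below.

```python
from collections import defaultdict

# Hypothetical toy data: (table, region size in GB). Not taken from the thread.
REGIONS = [("big", 10)] * 4 + [("small", 1)] * 4
SERVERS = ["rs1", "rs2"]


def count_only_assignment(regions, servers):
    """One placement with equal region counts per server, ignoring tables."""
    chunk = len(regions) // len(servers)
    return {s: regions[i * chunk:(i + 1) * chunk] for i, s in enumerate(servers)}


def by_table_assignment(regions, servers):
    """Spread each table's regions round-robin across the servers."""
    placement = defaultdict(list)
    placed = defaultdict(int)  # regions placed so far, per table
    for table, size in regions:
        server = servers[placed[table] % len(servers)]
        placement[server].append((table, size))
        placed[table] += 1
    return dict(placement)


def gb_per_server(placement):
    """Total data (GB) hosted by each server under a placement."""
    return {s: sum(size for _, size in regs) for s, regs in placement.items()}


count_only = gb_per_server(count_only_assignment(REGIONS, SERVERS))
by_table = gb_per_server(by_table_assignment(REGIONS, SERVERS))
print("count only:", count_only)
print("by table:  ", by_table)
```

In this toy model both placements give every server four regions, but only the by-table placement spreads the data evenly, which is the effect hbase.master.loadbalance.bytable=true is meant to achieve.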


On Tue, Mar 4, 2014 at 7:33 PM, divye sheth <divs.sheth@gmail.com> wrote:

> Thanks Bharath, I had a look at the JIRA and the fix version is 0.94.0. I
> am using 0.94.2, so I assume this is already part of HBase. Is that
> assumption correct? If not, do I have to make these changes in
> hbase-site.xml?
>
> Note: We have not run major_compaction for any of the tables so far. Will
> doing so mitigate the issue to some extent?
>
> Thanks
> Divye Sheth
>
>
> On Tue, Mar 4, 2014 at 7:10 PM, Bharath Vissapragada
> <bharathv@cloudera.com> wrote:
>
> > Did you check the per-table balancer (HBASE-3373)?
> > hbase.master.loadbalance.bytable=true
> >
> > The default load balancer balances only on region count, which can
> > result in all the big regions of a single table landing on one RS and
> > overloading it. This might be one of the reasons, and you can confirm
> > it from the current region assignment.
> >
> > Run a major compaction after enabling this setting and once the regions
> > are balanced, so that the newly written HFiles are uniformly distributed.
> >
> >
> >
> >
> > On Tue, Mar 4, 2014 at 6:54 PM, divye sheth <divs.sheth@gmail.com> wrote:
> >
> > > Thanks Jean, but why do only a couple of RSs get loaded with data? Out
> > > of 5 datanodes, only 2 are at around 90% disk usage, whereas the rest
> > > are at around 40%.
> > >
> > > We have run the HBase balancer. On average we had around 500 regions
> > > per regionserver, with a total of 5 RSs. We have even disabled a
> > > number of tables which are not required, and currently the count of
> > > regions/RS is around 120.
> > >
> > > Another question that comes to mind: somewhere down the line the
> > > Hadoop cluster tends to become imbalanced, leading to 100% disk
> > > utilization, and the balancer has to be triggered. How do you guys
> > > handle this problem in your HBase clusters?
> > >
> > > Just a thought: could we run the DFS balancer and, after the balancing
> > > activity, trigger a major compaction for each table?
> > >
> > > Thanks
> > > Divye Sheth
> > >
> > >
> > > On Tue, Mar 4, 2014 at 6:45 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> > >
> > > > Hi Divye,
> > > >
> > > > the DFS balancer is the last thing you want to run on your HBase
> > > > cluster. It will break data locality for all the compacted regions.
> > > >
> > > > On compaction, a region writes its files to the local server first,
> > > > and the 2 other replicas go to different datanodes. So on read, HBase
> > > > can guarantee that data is read from the local datanode and not from
> > > > another datanode over the network.
> > > >
> > > > Have you run the HBase balancer? How many regions do you have per
> > > > region server?
> > > >
> > > > JM
> > > >
> > >
> >
> >
> >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>



-- 
Bharath Vissapragada
<http://www.cloudera.com>
