hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: DFS Balancer with Hbase
Date Wed, 05 Mar 2014 05:57:33 GMT
That's a surprising change in behavior.
It seems like it was intentional in HBASE-6849, but it will catch folks by surprise.

-- Lars



________________________________
 From: Bharath Vissapragada <bharathv@cloudera.com>
To: user@hbase.apache.org; lars hofhansl <larsh@apache.org> 
Sent: Tuesday, March 4, 2014 6:43 PM
Subject: Re: DFS Balancer with Hbase
 

Thanks for correcting me, Lars. Yes, it's enabled by default in 0.94.2 but
disabled in trunk after HBASE-6849.

Divye, you can still check the block locality index and do a major
compaction and see how it goes.
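
To make "block locality index" concrete, here is a minimal sketch of what it
measures. This is not HBase code: the function name and the sample data are
made up for illustration, and it counts blocks equally instead of weighting
them by size, which is a simplifying assumption.

```python
from typing import Dict, List


def locality_index(block_hosts: Dict[str, List[str]], region_server: str) -> float:
    """Fraction of blocks that have a replica on the region server's own host."""
    if not block_hosts:
        return 0.0
    local = sum(1 for hosts in block_hosts.values() if region_server in hosts)
    return local / len(block_hosts)


# Hypothetical layout: block id -> datanodes holding a replica of that block.
blocks_before = {
    "blk_1": ["rs1", "rs2", "rs3"],
    "blk_2": ["rs2", "rs4", "rs5"],
    "blk_3": ["rs1", "rs4", "rs5"],
    "blk_4": ["rs3", "rs4", "rs5"],
}

# After a major compaction the rewritten files get their first replica on the
# local datanode, so every block of the region has a copy on rs1.
blocks_after = {
    "blk_5": ["rs1", "rs2", "rs4"],
    "blk_6": ["rs1", "rs3", "rs5"],
}

print(locality_index(blocks_before, "rs1"))
print(locality_index(blocks_after, "rs1"))
```

A low value for a region server means most of its reads go over the network,
which is the situation a major compaction is expected to repair.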


On Wed, Mar 5, 2014 at 6:02 AM, lars hofhansl <larsh@apache.org> wrote:

> Looks like it's on by default.
>
>
>
> ________________________________
>  From: Bharath Vissapragada <bharathv@cloudera.com>
> To: user@hbase.apache.org
> Sent: Tuesday, March 4, 2014 6:22 AM
> Subject: Re: DFS Balancer with Hbase
>
>
> Yes, it's included in 0.94.2. Add this property to the master's
> hbase-site.xml; it requires a master restart. Let the balancer run, make
> sure the new assignment is in place, and then run a major compaction
> during a maintenance window. Running a major compaction at least once a
> week is suggested, since it clears out the data corresponding to deletes
> (which is useful in your case) and also improves the block locality index
> for RS-local reads.
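>
> As a sketch, the hbase-site.xml entry would look roughly like this,
> assuming the property name from HBASE-3373 mentioned below and the usual
> Hadoop-style property layout. It only needs to be set explicitly if it is
> not already the default in your version:
>
> ```xml
> <property>
>   <name>hbase.master.loadbalance.bytable</name>
>   <value>true</value>
> </property>
> ```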
>
>
> On Tue, Mar 4, 2014 at 7:33 PM, divye sheth <divs.sheth@gmail.com> wrote:
>
> > Thanks Bharath. I had a look at the JIRA and the fix version is 0.94.0. I
> > am using 0.94.2, so I assume this should already be a part of HBase. Is
> > the assumption correct? If not, do I have to make these changes in
> > hbase-site.xml?
> >
> > Note: We have not run a major compaction for any of the tables so far.
> > Will doing so mitigate the issue to some extent?
> >
> > Thanks
> > Divye Sheth
> >
> >
> > On Tue, Mar 4, 2014 at 7:10 PM, Bharath Vissapragada
> > <bharathv@cloudera.com> wrote:
> >
> > > Did you check the per-table balancer (HBASE-3373)?
> > > hbase.master.loadbalance.bytable=true
> > >
> > > The default load balancer just balances based on the metric region
> > > count per table, which can result in all the big regions from a single
> > > table falling on one RS, thus overloading it. This might be one of the
> > > reasons, and you can confirm it from the current region assignment.
> > >
> > > Do a major compaction after enabling this setting and once the regions
> > > are balanced, so that the newly written HFiles are uniformly
> > > distributed.
> > >
> > >
> > >
> > >
> > > On Tue, Mar 4, 2014 at 6:54 PM, divye sheth <divs.sheth@gmail.com> wrote:
> > >
> > > > Thanks Jean, but why do only a couple of RSs get loaded with data?
> > > > We are seeing that out of 5 datanodes only 2 have around 90% disk
> > > > usage, whereas the rest are at around 40%.
> > > >
> > > > We have run the HBase balancer, and on average we have around 500
> > > > regions per regionserver and a total of 5 RSs. We have even disabled
> > > > a number of tables that are not required, and currently the count of
> > > > regions/RS is around 120.
> > > >
> > > > Another question that comes to mind: somewhere down the line the
> > > > Hadoop cluster tends to become imbalanced, leading to 100% disk
> > > > utilization, and the balancer has to be triggered. How do you guys
> > > > handle this problem in your HBase clusters?
> > > >
> > > > Just a thought: could we execute the DFS balancer and, after the
> > > > balancing activity, trigger a major compaction for each table?
> > > >
> > > > Thanks
> > > > Divye Sheth
> > > >
> > > >
> > > > On Tue, Mar 4, 2014 at 6:45 PM, Jean-Marc Spaggiari
> > > > <jean-marc@spaggiari.org> wrote:
> > > >
> > > > > Hi Divye,
> > > > >
> > > > > The DFS balancer is the last thing you want to run in your HBase
> > > > > cluster. It will break the data locality of all the compacted
> > > > > regions.
> > > > >
> > > > > On compaction, a region writes its files to the local server first,
> > > > > and the 2 other replicas go to different datanodes. So on read,
> > > > > HBase can guarantee that data is read from the local datanode and
> > > > > not from another datanode over the network.
> > > > >
> > > > > Have you run the HBase balancer? How many regions do you have per
> > > > > region server?
> > > > >
> > > > > JM
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Bharath Vissapragada
> > > <http://www.cloudera.com>

>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>




-- 
Bharath Vissapragada
<http://www.cloudera.com>