accumulo-user mailing list archives

From dlmarion <dlmar...@comcast.net>
Subject Re: Entry-based TableBalancer
Date Thu, 30 Jul 2015 01:06:14 GMT

Hotspotting was the first thing that came to my mind with the proposed balancer. The tservers
don't keep all the key/value pairs in memory, so what you are really balancing is query and
live ingest load across your resources, not stored entries.




-------- Original message --------
From: Eric Newton <eric.newton@gmail.com> 
Date: 07/29/2015  8:46 PM  (GMT-05:00) 
To: user@accumulo.apache.org 
Subject: Re: Entry-based TableBalancer 

To my knowledge, nobody has written such a balancer.

In the history of the project, we started by writing advanced, complicated balancers that moved
tablets around much too quickly, which degraded performance. After that, we wrote much simpler
balancers to avoid the chaos. We're moving back to more complex balancers, but mostly just
to ensure that we aren't hotspotting, based on known ingest patterns (date related, for example).

If you write a new balancer, make it slow to move tablets, and very simple. Avoid over-optimizing
tablet placement.

-Eric

On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <kpelykh@gmail.com> wrote:
Hi, 

I'm looking for a tablet balancer which operates based on a number of entries per tablet as
opposed to a number of tablets per tablet server. My goal is to get even distribution of entries
across the cluster. 

As an example: 

tablet #1  15M entries
tablet #2   5M entries
tablet #3   8M entries

After balancing tablets I would want to get:

Server 1 hosts: tablet1 
Server 2 hosts: tablet2, tablet3
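
[Editor's sketch] The placement above can be reproduced with a simple greedy pass: sort
tablets by entry count, largest first, and give each one to the server currently holding
the fewest entries. This is only an illustration of the idea under discussion, not an
Accumulo balancer and not code from anyone in this thread; the function name
assign_by_entries and the server names are hypothetical, and it ignores migration cost,
which the replies above warn about.

```python
import heapq

def assign_by_entries(tablet_entries, servers):
    """Greedy sketch: place the largest tablets first, each on the
    server that currently hosts the fewest entries."""
    # Min-heap of (total entries hosted so far, server name).
    heap = [(0, server) for server in servers]
    heapq.heapify(heap)
    placement = {server: [] for server in servers}

    # Largest tablets first, so big ones are spread before small ones fill gaps.
    by_size = sorted(tablet_entries.items(), key=lambda kv: kv[1], reverse=True)
    for tablet, entries in by_size:
        load, server = heapq.heappop(heap)
        placement[server].append(tablet)
        heapq.heappush(heap, (load + entries, server))
    return placement

# Entry counts taken from the example in this message.
tablets = {"tablet1": 15_000_000, "tablet2": 5_000_000, "tablet3": 8_000_000}
placement = assign_by_entries(tablets, ["server1", "server2"])
print(placement)
```

With the three tablets from the example, tablet1 ends up alone on one server and tablet2
and tablet3 share the other. A real balancer would also have to limit how many tablets
it moves per pass.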

The idea is pretty simple and I believe such a balancer has already been developed, so I decided
to check before reinventing the wheel. 

Thanks!
Konstantin

--------
Big Data / Lucene and Solr Consultant
LinkedIn: linkedin.com/in/kpelykh
Website: www.kpelykh.com


