accumulo-user mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: Entry-based TableBalancer
Date Thu, 30 Jul 2015 00:46:00 GMT
To my knowledge, nobody has written such a balancer.

Early in the project's history, we wrote advanced, complicated balancers
that moved tablets around much too quickly, which degraded performance.
After that, we wrote much simpler balancers to avoid the chaos. We're
moving back to more complex balancers, but mostly just to ensure that we
aren't hotspotting, based on known ingest patterns (date-related, for
example).

If you write a new balancer, make it slow to move tablets, and very
simple.  Avoid over-optimizing tablet placement.
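
As a rough illustration of that advice, here is a minimal sketch in plain
Java, not a real Accumulo balancer. Every name in it (EntryBalancerSketch,
Move, planPass, maxMoves) is hypothetical and none of it uses Accumulo's
API. It assumes per-tablet entry counts are already known, and on each
pass it proposes at most maxMoves migrations from the most loaded server
to the least loaded one, and only when a move actually narrows the gap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Accumulo.
final class EntryBalancerSketch {

    // One proposed migration of a tablet between two servers.
    static final class Move {
        final String tablet, from, to;
        Move(String tablet, String from, String to) {
            this.tablet = tablet; this.from = from; this.to = to;
        }
    }

    // Total number of entries across the given tablets.
    static long load(List<String> tablets, Map<String, Long> entries) {
        long sum = 0;
        for (String t : tablets) sum += entries.get(t);
        return sum;
    }

    // Propose at most maxMoves migrations for a single balancing pass.
    static List<Move> planPass(Map<String, List<String>> hosting,
                               Map<String, Long> entries, int maxMoves) {
        // Work on a sorted copy: the caller's map stays untouched and
        // ties between servers resolve deterministically.
        Map<String, List<String>> state = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : hosting.entrySet()) {
            state.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        List<Move> moves = new ArrayList<>();
        while (moves.size() < maxMoves) {
            // Find the most and least loaded servers by entry count.
            String heavy = null, light = null;
            long heavyLoad = 0, lightLoad = 0;
            for (Map.Entry<String, List<String>> e : state.entrySet()) {
                long l = load(e.getValue(), entries);
                if (heavy == null || l > heavyLoad) { heavy = e.getKey(); heavyLoad = l; }
                if (light == null || l < lightLoad) { light = e.getKey(); lightLoad = l; }
            }
            if (heavy == null || heavy.equals(light)) break;
            long gap = heavyLoad - lightLoad;
            // Pick the tablet whose move leaves the smallest gap between
            // the two servers; skip moves that would not shrink it.
            String best = null;
            long bestGap = gap;
            for (String t : state.get(heavy)) {
                long size = entries.get(t);
                long newGap = Math.abs(gap - 2 * size);
                if (size > 0 && newGap < bestGap) { best = t; bestGap = newGap; }
            }
            if (best == null) break; // nothing helps, so leave tablets alone
            state.get(heavy).remove(best);
            state.get(light).add(best);
            moves.add(new Move(best, heavy, light));
        }
        return moves;
    }
}
```

With the entry counts from the quoted example below, and all three
tablets starting on server2, one pass with maxMoves set to 1 proposes
moving tablet1 to server1 and then stops. A larger maxMoves changes
nothing in that case, because no further move would narrow the gap.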

-Eric

On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <kpelykh@gmail.com>
wrote:

> Hi,
>
> I'm looking for a tablet balancer which operates based on the number of
> entries per tablet as opposed to the number of tablets per tablet server.
> My goal is to get an even distribution of entries across the cluster.
>
> As an example:
>
> tablet #1  15M entries
> tablet #2   5M entries
> tablet #3   8M entries
>
> After balancing tablets I would want to get:
>
> Server 1 hosts: tablet1
> Server 2 hosts: tablet2, tablet3
>
> The idea is pretty simple and I believe such a balancer has already been
> developed, so I decided to check before reinventing the wheel.
>
> Thanks!
> Konstantin
>
> --------
> Big Data / Lucene and Solr Consultant
> LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
> Website: www.kpelykh.com
>
