accumulo-user mailing list archives

From Konstantin Pelykh <kpel...@gmail.com>
Subject Re: Entry-based TableBalancer
Date Thu, 30 Jul 2015 02:16:05 GMT
In this specific case, ingest happens only once. It's a write-once, read-many
type of application, so with such a balancer I would want to balance tablets
based on the number of entries once ingest is fully complete.

--------
Big Data / Search Consultant
Cell: +1 (646) 639-3916
E-mail: kpelykh@gmail.com
LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
Website: www.kpelykh.com

On Wed, Jul 29, 2015 at 6:06 PM, dlmarion <dlmarion@comcast.net> wrote:

> Hotspotting was the first thing that came to my mind with the proposed
> balancer. The tservers don't keep all the K/V in memory. You are balancing
> query and live ingest across your resources.
>
> -------- Original message --------
> From: Eric Newton <eric.newton@gmail.com>
> Date: 07/29/2015 8:46 PM (GMT-05:00)
> To: user@accumulo.apache.org
> Subject: Re: Entry-based TableBalancer
>
> To my knowledge, nobody has written such a balancer.
>
> In the history of the project, we started writing advanced, complicated
> balancers that moved tablets around much too quickly, which degraded
> performance. After that, we wrote much simpler balancers to avoid the
> chaos. We're moving back to more complex balancers, but mostly just to
> ensure that we aren't hotspotting, based on known ingest patterns (date
> related, for example).
>
> If you write a new balancer, make it slow to move tablets, and very
> simple.  Avoid over-optimizing tablet placement.
>
> -Eric
>
> On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <kpelykh@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm looking for a tablet balancer which operates based on the number of
>> entries per tablet as opposed to the number of tablets per tablet server. My
>> goal is to get an even distribution of entries across the cluster.
>>
>> As an example:
>>
>> tablet #1  15M entries
>> tablet #2   5M entries
>> tablet #3   8M entries
>>
>> After balancing tablets I would want to get:
>>
>> Server 1 hosts: tablet1
>> Server 2 hosts: tablet2, tablet3
>>
>> The idea is pretty simple and I believe such a balancer has already been
>> developed, so I decided to check before reinventing the wheel.
>>
>> Thanks!
>> Konstantin
>>
>> --------
>> Big Data / Lucene and Solr Consultant
>> LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
>> Website: www.kpelykh.com
>>
>
>

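For illustration only, the placement asked for in the original question can be
sketched as a greedy assignment: sort tablets by entry count, largest first,
and give each one to the server currently hosting the fewest entries. This is
not an Accumulo balancer and uses no Accumulo API; the function name
assign_by_entries and the server names are hypothetical, and the entry counts
are the ones from the example in the thread.

```python
import heapq


def assign_by_entries(tablet_entries, servers):
    """Greedy sketch: largest tablet first, onto the least-loaded server."""
    # Min-heap of (entries hosted so far, server name); lowest load pops first.
    heap = [(0, server) for server in servers]
    heapq.heapify(heap)
    placement = {server: [] for server in servers}

    # Visit tablets from largest to smallest entry count.
    by_size = sorted(tablet_entries.items(), key=lambda kv: kv[1], reverse=True)
    for tablet, entries in by_size:
        load, server = heapq.heappop(heap)
        placement[server].append(tablet)
        heapq.heappush(heap, (load + entries, server))
    return placement


# Entry counts taken from the example in the thread; server names are made up.
tablets = {"tablet1": 15_000_000, "tablet2": 5_000_000, "tablet3": 8_000_000}
placement = assign_by_entries(tablets, ["server1", "server2"])
print(placement)
```

With these numbers, tablet1 ends up alone on one server and tablet2 and tablet3
share the other, matching the layout described in the question. A real balancer
would also need to follow the advice given in the thread: keep it simple and
move tablets slowly.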