accumulo-user mailing list archives

From Konstantin Pelykh <kpel...@gmail.com>
Subject Re: Entry-based TableBalancer
Date Thu, 30 Jul 2015 17:10:58 GMT
Thanks for the suggestion. Below are some details explaining why I want
such a balancer:
I'm basing my application on the accumulo-wikipedia example, so there can be
multiple partitions per tablet. Some partitions are larger, others are
smaller. It is possible to split partition ranges manually after
ingestion is complete and rely on the default balancer to spread tablets
across the cluster; however, in this case some servers end up overloaded
compared to others.
Currently the slowest server (hosting the largest tablet) determines the final
time for a search query, so I want to distribute entities across the cluster
so that they are well balanced and all servers spend a similar amount of
time processing documents through OptimizedQueryIterators.
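
To make the goal concrete, here is a minimal sketch of entry-based placement:
take tablets from largest to smallest and give each one to the server that
currently holds the fewest entries. This is plain Python for illustration
only, not Accumulo's balancer API; the function name, the server names, and
the input dictionary are hypothetical, and the entry counts are the ones from
the example quoted further down.

```python
import heapq

def assign_by_entries(tablet_entries, servers):
    """Greedy placement: largest tablet first, onto the least-loaded server.

    tablet_entries: dict of tablet name -> entry count (hypothetical input)
    servers: list of server names (hypothetical input)
    """
    # Min-heap of (entries currently hosted, server name).
    heap = [(0, server) for server in servers]
    heapq.heapify(heap)
    assignment = {server: [] for server in servers}

    # Visit tablets in descending order of entry count.
    by_size = sorted(tablet_entries.items(), key=lambda kv: kv[1], reverse=True)
    for tablet, entries in by_size:
        load, server = heapq.heappop(heap)  # least-loaded server so far
        assignment[server].append(tablet)
        heapq.heappush(heap, (load + entries, server))
    return assignment

tablets = {"tablet1": 15_000_000, "tablet2": 5_000_000, "tablet3": 8_000_000}
plan = assign_by_entries(tablets, ["server1", "server2"])
for server, hosted in plan.items():
    print(server, hosted)
```

With these three tablets, tablet1 lands alone on one server and tablet2 and
tablet3 share the other. A real balancer would also have to limit how many
tablets it migrates per pass, which this sketch does not attempt.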

Konstantin
--------
Big Data / Search Consultant
LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
Website: www.kpelykh.com

On Wed, Jul 29, 2015 at 9:18 PM, mohit.kaushik <mohit.kaushik@orkash.com>
wrote:

> If I understand you correctly, for this purpose you can simply pre-split
> tables based on range to evenly distribute data across tablets.
>
> https://accumulo.apache.org/1.7/accumulo_user_manual.html#_pre_splitting_tables
>
>
>
>
> On 07/30/2015 07:46 AM, Konstantin Pelykh wrote:
>
> In this specific case, ingest happens only once. It's a write-once,
> read-many type of application, so with such a balancer I would want to
> balance tablets based on the number of entities after ingest is fully complete.
>
> --------
> Big Data / Search Consultant
> Cell: +1 (646) 639-3916
> E-mail: kpelykh@gmail.com
> LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
> Website: www.kpelykh.com
>
> On Wed, Jul 29, 2015 at 6:06 PM, dlmarion <dlmarion@comcast.net> wrote:
>
>> Hotspotting was the first thing that came to my mind with the proposed
>> balancer. The tservers don't keep all the K/V in memory. You are balancing
>> query and live ingest across your resources.
>>
>>
>>
>>
>>
>> -------- Original message --------
>> From: Eric Newton <eric.newton@gmail.com>
>> Date: 07/29/2015 8:46 PM (GMT-05:00)
>> To: user@accumulo.apache.org
>> Subject: Re: Entry-based TableBalancer
>>
>> To my knowledge, nobody has written such a balancer.
>>
>> In the history of the project, we started writing advanced, complicated
>> balancers that moved tablets around much too quickly, which degraded
>> performance. After that, we wrote much simpler balancers to avoid the
>> chaos. We're moving back to more complex balancers, but mostly just to
>> ensure that we aren't hotspotting, based on known ingest patterns (date
>> related, for example).
>>
>> If you write a new balancer, make it slow to move tablets, and very
>> simple.  Avoid over-optimizing tablet placement.
>>
>> -Eric
>>
>> On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <kpelykh@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm looking for a tablet balancer which operates based on the number of
>>> entries per tablet as opposed to the number of tablets per tablet server. My
>>> goal is to get an even distribution of entries across the cluster.
>>>
>>> As an example:
>>>
>>> tablet #1  15M entries
>>> tablet #2   5M entries
>>> tablet #3   8M entries
>>>
>>> After balancing tablets I would want to get:
>>>
>>> Server 1 hosts: tablet1
>>> Server 2 hosts: tablet2, tablet3
>>>
>>> The idea is pretty simple and I believe such a balancer has already been
>>> developed, so I decided to check before reinventing the wheel.
>>>
>>> Thanks!
>>> Konstantin
>>>
>>> --------
>>> Big Data / Lucene and Solr Consultant
>>> LinkedIn: linkedin.com/in/kpelykh <http://www.linkedin.com/in/kpelykh>
>>> Website: www.kpelykh.com
>>>
>>
>>
>
>
> --
>
> * Mohit Kaushik*
> Software Engineer
> A Square,Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
> *Tel:* +91 (124) 4969352 | *Fax:* +91 (124) 4033553
>
>
