accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Elser <josh.el...@gmail.com>
Subject Re: What is the optimal number of tablets for a large table?
Date Sat, 10 Oct 2015 00:17:35 GMT
Jeff Kubina wrote:
> I read the following from the Accumulo manual on tablet merging
> <https://accumulo.apache.org/1.6/accumulo_user_manual.html#_merging_tablets>:
>
>
>     Over time, a table can get very large, so large that it has hundreds
>     of thousands of split points. Once there are enough tablets to
>     spread a table across the entire cluster, additional splits may not
>     improve performance, and may create unnecessary bookkeeping.
>
>
> So would the optimal number of tablets for a very large table be close
> to the total tservers times the total cores of the machine (or the
> worker threads the tservers are config to use--whichever is less)?
>

The general response normally given is "a few hundred of tablets" per 
server. I'm not sure if we really have a formula to compute this, 
though. I'm also curious how the things that are important to you affect 
this number (e.g. read latency, write throughput, read throughput, etc).

Would be neat to think through the math and create a little harness that 
actually test it out.

Mime
View raw message