accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: Determining tablets assigned to table splits, and the number of rows in each tablet
Date Sat, 04 Oct 2014 16:04:35 GMT
I wrote up an explanation of these topics at
https://github.com/medined/D4M_Schema/blob/master/docs/data_distribution.md.
If you have the luxury of writing the ingest code, you can also compute
cardinality estimates using the techniques described in
https://github.com/medined/D4M_Schema/blob/master/docs/cardinality.md.

On Sat, Oct 4, 2014 at 12:23 AM, Dylan Hutchison <dhutchis@stevens.edu> wrote:
> This is for Accumulo 1.6.  Suppose we have the table splits
>
> c
> g
> w
>
> Does anyone know how to determine
>
> 1. the number of tablets assigned to each table split range?
>    For this example, this is the number of tablets in the ranges (-Inf,c),
>    (c,g), (g,w), (w,Inf).  Or is the design 1-1, that is, for each table split
>    range there is exactly one tablet?
> 2. the number of rows inside all the tablets occupying a table split range?
>    For this example, this is the total number of rows among all tablets in the
>    ranges (-Inf,c), (c,g), (g,w), (w,Inf).
>
> We use these counts to verify how well manually set table splits balance
> the load across a table.
>
> Some context: two years ago, while working on D4M with Accumulo 1.5, I wrote
> functions that found these numbers.  I took the dark route of using
> non-public Accumulo APIs to get TabletServer information, get TabletStats
> information, and match them to a table's splits by scanning the extents
> listed in the METATABLE.  I can share the code if anyone is curious.  It's
> not pretty, but it did the job.
>
> Moving forward as we aim to upgrade to Accumulo 1.6, we should determine the
> tablet split information the right way, not by reverse engineering Accumulo.
> Any suggestions?
>
> Thanks,
> Dylan Hutchison
>
> --
> www.cs.stevens.edu/~dhutchis
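
For illustration only, here is a minimal sketch of the row-count check described in the quoted message. It is not taken from either linked document and is not the D4M code mentioned above. It relies on two things: each split point is a tablet boundary, so N splits give N+1 tablets (one per split range), and a tablet covers (prevEndRow, endRow], so a row equal to a split point falls in the range that ends at it. The class and method names are hypothetical, and the split points and row IDs are assumed to have already been fetched into plain Java strings (for example, splits from TableOperations.listSplits and row IDs from a scan). No Accumulo calls are made here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: counts how many row IDs fall into each split range.
public class SplitRangeCounts {

    /**
     * Returns an array of length splits.size() + 1. Entry i is the number of
     * rows in the i-th range: (-Inf, s0], (s0, s1], ..., (sLast, +Inf).
     * Rows are compared with String.compareTo, which matches Accumulo's
     * byte-wise ordering only for ASCII row IDs (an assumption of this sketch).
     */
    public static long[] countRowsPerRange(List<String> splits, List<String> rows) {
        List<String> sorted = new ArrayList<>(splits);
        Collections.sort(sorted);
        long[] counts = new long[sorted.size() + 1];
        for (String row : rows) {
            int pos = Collections.binarySearch(sorted, row);
            // Found: the row equals a split point and belongs to the range
            // ending at that split. Not found: binarySearch returns
            // -(insertionPoint) - 1, and insertionPoint is the range index.
            int idx = pos >= 0 ? pos : -(pos + 1);
            counts[idx]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("c", "g", "w");
        List<String> rows = Arrays.asList("a", "c", "d", "w", "x", "z");
        // Ranges: (-Inf,c], (c,g], (g,w], (w,+Inf)
        System.out.println(Arrays.toString(countRowsPerRange(splits, rows)));
        // prints [2, 1, 1, 2]
    }
}
```

Comparing the per-range counts against each other gives a rough picture of how evenly the manual splits divide the rows. Reading every row ID is expensive on a large table, so this is only practical on a sample or a small table.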
