accumulo-dev mailing list archives

From Christopher <>
Subject Re: sharding via different tables
Date Mon, 17 Aug 2015 19:21:30 GMT
I'd expect performance to be slightly better with separate tables than
with locality groups: managing locality groups is relatively cheap, but
it's not entirely free.

Namespaces work like a table prefix, but also provide a means to
easily configure all of the tables in a namespace at once. So, they're
either slightly or significantly better than a table prefix, depending
on your needs.
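
To make the difference concrete, here is a rough sketch of the idea in
plain Java. It is only a simplified model, not the Accumulo API: the
class NamespaceConfigSketch, its methods, the "_" prefix separator and
the customer id "customer42" are all made up for illustration. The two
things it assumes from Accumulo itself are that a namespaced table is
addressed as "namespace.table" and that a property set on a namespace
applies to its tables unless a table overrides it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of namespace-level configuration.
// This is NOT the Accumulo client API; it only illustrates the lookup order.
class NamespaceConfigSketch {
    // Properties set once per namespace, keyed by namespace name.
    private final Map<String, Map<String, String>> namespaceProps = new HashMap<>();
    // Per-table overrides, keyed by fully qualified table name.
    private final Map<String, Map<String, String>> tableProps = new HashMap<>();

    // A namespaced table is addressed as "<namespace>.<table>".
    static String qualify(String namespace, String table) {
        return namespace + "." + table;
    }

    // A plain prefix is only a naming convention (the "_" separator is an assumption).
    static String prefixed(String customerId, String table) {
        return customerId + "_" + table;
    }

    void setNamespaceProperty(String namespace, String key, String value) {
        namespaceProps.computeIfAbsent(namespace, n -> new HashMap<>()).put(key, value);
    }

    void setTableProperty(String qualifiedTable, String key, String value) {
        tableProps.computeIfAbsent(qualifiedTable, t -> new HashMap<>()).put(key, value);
    }

    // Look up a property: a table-level override wins, then the namespace value.
    // Returns null when neither level defines the key.
    String resolve(String qualifiedTable, String key) {
        Map<String, String> table = tableProps.get(qualifiedTable);
        if (table != null && table.containsKey(key)) {
            return table.get(key);
        }
        int dot = qualifiedTable.indexOf('.');
        if (dot < 0) {
            return null; // prefix-style name: no shared namespace settings in this model
        }
        Map<String, String> ns = namespaceProps.get(qualifiedTable.substring(0, dot));
        return ns == null ? null : ns.get(key);
    }

    public static void main(String[] args) {
        NamespaceConfigSketch cfg = new NamespaceConfigSketch();
        String key = "table.file.replication";

        // One setting on the namespace covers every table in it.
        cfg.setNamespaceProperty("customer42", key, "2");
        String events = qualify("customer42", "events");
        String users = qualify("customer42", "users");
        cfg.setTableProperty(users, key, "3"); // a single table can still override

        System.out.println(events + " -> " + cfg.resolve(events, key));
        System.out.println(users + " -> " + cfg.resolve(users, key));

        // With a prefix, each table would have to be configured individually.
        String flat = prefixed("customer42", "events");
        System.out.println(flat + " -> " + cfg.resolve(flat, key));
    }
}
```

If all you need is to keep each customer's tables apart by name, the
prefix and the namespace behave the same. The namespace only starts to
pay off once you want one setting to apply to every table a customer
has.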

Christopher L Tubbs II

On Mon, Aug 17, 2015 at 3:01 PM, z11373 <> wrote:
> Thanks, Christopher, for the valuable insight.
> Right now we don't have a scenario that needs to query data from multiple
> customers at once. Perhaps we will some time in the future, but that 'future'
> could be years from now (or perhaps never), so I am inclined to
> implement them as separate tables for now.
> Though they are in separate tables, I will still apply a column visibility to
> each row in the table. The visibility string could be something like the
> customer id. The caller will be another app of ours, so we can trust it
> (it still needs to pass that customer id as the authz string).
> In terms of scan performance, is it true that whether we shard by column
> family or by different table won't matter much (since I'd think we can also
> create a separate locality group for each column family)?
> Thanks for the tip on using namespaces; originally I'd thought of prefixing
> the table names with the customer id. I guess there is no difference, right?
> Thanks,
> Z
