accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3147) Replication table should be user-controlled or live in accumulo namespace
Date Fri, 31 Oct 2014 21:00:34 GMT


Christopher Tubbs commented on ACCUMULO-3147:

bq. Yeah, I wanted to focus less on the specifics of where the table came from and just make
it dynamic. I don't want this to be an issue if we make a replication namespace and suddenly
decide we want the replication table over there instead of in the accumulo namespace (for example).

Describing it this way seems like an argument for making it external to Accumulo, as an add-on,
not as a built-in feature. In my view, it's either a system-managed table or it's a utility table
declared by some external, client system. As is, it seems like a hybrid.

bq. I will leave it to you to expose all of those scary internals out to the clients. I'm
not convinced you can actually do this and properly track the replication status; however,
you are more than free to investigate it yourself.

I have no interest in even attempting to do that. My point was that this seems like it would
have been a preferable implementation, because as is, replication seems to be a hybrid user-system/baked-in
system, and bloats Accumulo's internals. I'm not objecting to the point where I think we should
revert (not seriously considering that anyway), but I do think it needs a lot more polishing
before it can be released.

bq. How is it arbitrarily doing anything? There are a number of things that are always checked....

The very idea of dynamic checking of the table's existence, and proper configuration, reveals
the fact that the table is expected to be a user-managed table, not a system-managed one.
This is the problem I have with the current implementation. It's treated as a user-managed
table, but created at runtime as needed by the master, utilized by the system user for internal
system features. It's tested as a user-table, also, being created by the root user in most
tests, and deleted as needed for testing. Very little about how it's utilized reveals itself
to be part of an internal system table. System tables are always present, have well-defined
access permissions, exist in a non-conflicting system-managed namespace, have well-known table
IDs, and have strict tests to ensure they behave this way. User tables can be created on the
fly, exist anywhere except the system namespace, can be deleted by users, and have permissions
that are appropriate to the external user/system that utilizes them. The trace table is a
slight exception, because it existed before namespaces and somewhat blurs the boundaries,
but even it is primarily treated as a user-controlled table by an external service, since
the external tracer service creates the table, and uses credentials to do so, from a stored
configuration file. But, the replication table is not created by an external service. It's
created by the master, as needed, can conflict with user table names/config, which can block
replication from working, does not have a fixed tableID, and is part of an internal feature.
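
The distinction above can be sketched in a few lines. This is only an illustration under stated assumptions: it does not use Accumulo's API, the helper names are hypothetical, and it assumes the replication table is named "replication" in the default namespace. It relies only on the convention that a qualified table name has the form "namespace.table" and that "accumulo" is the reserved system namespace.

```python
# Sketch only: hypothetical helpers, not part of Accumulo's API.
SYSTEM_NAMESPACE = "accumulo"  # reserved system namespace
DEFAULT_NAMESPACE = ""         # unqualified table names live here, alongside user tables


def namespace_of(table_name: str) -> str:
    """Return the namespace of a 'namespace.table' name ('' if unqualified)."""
    namespace, dot, _ = table_name.rpartition(".")
    return namespace if dot else DEFAULT_NAMESPACE


def is_system_table(table_name: str) -> bool:
    """True only for tables inside the reserved system namespace."""
    return namespace_of(table_name) == SYSTEM_NAMESPACE


def can_conflict_with_user_tables(table_name: str) -> bool:
    """Users cannot create tables in the system namespace, so only
    tables outside it can collide with a user-chosen name."""
    return not is_system_table(table_name)
```

Under these assumptions, "accumulo.metadata" is classified as a system table, while a table named "replication" in the default namespace is not, so a user table of the same name could conflict with it.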

Don't get me wrong... I don't think this is unfixable (indeed, I'm trying to fix some of it).
I think it just needs some more polishing before release. But, it's what I meant by "arbitrarily".

> Replication table should be user-controlled or live in accumulo namespace
> -------------------------------------------------------------------------
>                 Key: ACCUMULO-3147
>                 URL:
>             Project: Accumulo
>          Issue Type: Sub-task
>          Components: replication
>            Reporter: Christopher Tubbs
>            Assignee: Christopher Tubbs
>            Priority: Blocker
>             Fix For: 1.7.0
> At present, it looks like the replication table is managed by/written to by the system
> user, yet the table lives in the default namespace, which is where user tables live.
> This appears to violate the namespace model of segregating system tables from user tables.
> There are a few options for resolution:
> # Move the replication table into the reserved accumulo system namespace (there's some
> complication with this, because the system namespace is currently static, and the replication
> table may be created at any time; additionally, if users are expected to interact with this
> table... and I'm not sure if they are at all, the system namespace is probably not appropriate).
> # Create an additional reserved system namespace for replication (my least preferred option).
> # Use user credentials to manage/write to this table, rather than the system user (this
> is what the tracer/trace table does, and this is my preferred solution.)

This message was sent by Atlassian JIRA