accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1719) Convenient instanceName to instanceID mapping is unnecessary
Date Mon, 09 Jun 2014 17:37:03 GMT


Eric Newton commented on ACCUMULO-1719:

Never mind... if you don't have the old secret, you probably can't update the instance name.

> Convenient instanceName to instanceID mapping is unnecessary
> ------------------------------------------------------------
>                 Key: ACCUMULO-1719
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>            Reporter: Christopher Tubbs
>             Fix For: 1.7.0
> The ZooKeeperInstance constructor typically takes two parameters: instanceName and a comma-separated
list of ZooKeeper host[:port] entries (there are also some others that take a UUID and/or
a timeout setting).
> Initialize generates a UUID and associates a user-provided instanceName with it, using the
following mapping in ZooKeeper:
> /accumulo/instances/instanceName, which contains a UUID, which points to /accumulo/UUID
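
The two-step lookup described above can be sketched as follows. This is only an illustration: a plain dictionary stands in for ZooKeeper, and the function names `initialize` and `resolve` are hypothetical, not part of the Accumulo API. Instance secrets and ACLs are left out entirely.

```python
import uuid


def initialize(zk, instance_name):
    """Mimic Initialize: generate a UUID and map the instance name to it."""
    instance_id = str(uuid.uuid4())
    # The name node holds the UUID ...
    zk["/accumulo/instances/" + instance_name] = instance_id
    # ... and the UUID names the node under which the instance data lives.
    zk["/accumulo/" + instance_id] = {}
    return instance_id


def resolve(zk, instance_name):
    """Two-step lookup: instanceName -> UUID -> /accumulo/UUID."""
    instance_id = zk["/accumulo/instances/" + instance_name]
    return "/accumulo/" + instance_id


zk = {}  # stand-in for the ZooKeeper tree (an assumption for this sketch)
first_id = initialize(zk, "dev")
second_id = initialize(zk, "dev")  # re-initialize with the same name
```

In this sketch, re-initializing with the same name overwrites the name node, so `resolve(zk, "dev")` returns the path for `second_id`, while the data under `first_id` stays in the tree with nothing left pointing at it.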
> Since the introduction of instance.secret, there are potential problems with this mapping.
> If /accumulo (and /accumulo/instances and /accumulo/instances/instanceName) is created
by Initialize in a write-protected way (using instance.secret), then re-initializing with
a new generated instanceID but the same instanceName will not work unless the new instance
has the same instance secret. This is very limiting and can be a nightmare for system administrators
and developers trying to re-initialize.
> If it is not created in a write-protected way, there's an even bigger problem, because
anybody with access to ZooKeeper can overwrite the old mapping to point to a new instance
(and we expect all clients to be able to access ZooKeeper). While the old data is still protected,
any clients connecting with the instanceName will connect (and ingest to) the new instanceID
that the instanceName currently maps to.
> The current implementation appears to be using the former... (the instanceName node itself
is protected by the same secret as the instanceId and child nodes). This means that at least
the mapping is protected from being overwritten... but it also means that it doesn't provide
us with any added value. Even if we're counting the added value of being able to reinitialize
the same instanceName (generating a new instanceID), leaving the old instance data around
for inspection, we've got the problems of ZK filling up and the fact that, once the mapping is
rewritten, we can't tell which old instanceID was the previous one to inspect.
> A better solution:
> Drop the mapping. It is unnecessarily complex with no added value. Allow the instanceName
that users create in new versions to represent the unique ID. Don't generate/use UUIDs anymore...
use the provided instanceName. Keep the API for UUID... but just for convenience (treat it
like a string internally). We can still prompt to overwrite the old instance... if it exists
AND we have the same secret... but when we "overwrite it", we can optionally rename the old
instanceName to instanceName_backup_date.
> Dropping the mapping has the benefit of reduced complexity, and is (mostly) backwards-compatible
(instances can't have the name "instances"). It is easier on developers to debug their instances,
because there's no obscure UUID to deal with (unless they want to use that as the name) and
they can find the old versions of their instances if they choose to back up the old data when
re-initializing. If not, they can avoid ZK filling up (esp. in dev environments where instanceNames
get reused often). And, with a backup naming convention, it's easy for admins to decide which
old instance data to keep and which to throw away... without the need of a mapping. The scope
for the instance.secret is also well-defined to just the /accumulo/instanceName that created
it, and there's no possibility of overwriting the instanceName to instanceID mapping.
> Instance names work best when unique. Instance IDs are guaranteed to be unique. There's
no good reason these should be separate things.
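
A minimal sketch of the proposed scheme, under several assumptions: a dictionary again stands in for ZooKeeper, the secret check is a plain string comparison rather than a ZooKeeper ACL, and the ISO date suffix is one possible reading of "instanceName_backup_date". The function name `initialize_by_name` and its parameters are hypothetical.

```python
from datetime import date


def initialize_by_name(zk, instance_name, secret, today, keep_backup=True):
    """Use the instance name itself as the unique ID under /accumulo."""
    if instance_name == "instances":
        # Reserved so the old /accumulo/instances node cannot be clobbered.
        raise ValueError('an instance cannot be named "instances"')
    path = "/accumulo/" + instance_name
    old = zk.get(path)
    if old is not None:
        # Overwriting is only allowed with the secret that created the node.
        if old["secret"] != secret:
            raise PermissionError("instance exists and the secret does not match")
        if keep_backup:
            # Assumed naming convention: instanceName_backup_YYYY-MM-DD.
            zk[path + "_backup_" + today.isoformat()] = old
    zk[path] = {"secret": secret, "data": {}}
    return path


zk = {}  # stand-in for the ZooKeeper tree (an assumption for this sketch)
initialize_by_name(zk, "dev", "s3cret", date(2014, 6, 9))
zk["/accumulo/dev"]["data"]["table"] = 1
initialize_by_name(zk, "dev", "s3cret", date(2014, 6, 9))
```

After the second call, the old data sits under `/accumulo/dev_backup_2014-06-09` and `/accumulo/dev` is fresh, so there is no name-to-ID mapping to consult or overwrite. Passing `keep_backup=False` would discard the old node instead, which corresponds to the case where administrators want to keep ZK from filling up.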

This message was sent by Atlassian JIRA
