accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3842) [UMBRELLA] Remove non-transient data from ZooKeeper
Date Tue, 26 May 2015 14:14:18 GMT


Josh Elser commented on ACCUMULO-3842:

bq.    Loss of ZooKeeper doesn't lose table configuration and users
Is this a genuine problem for users?

IMO, we don't have enough rigor here for me to be comfortable telling someone that
we have the means to back up and restore ZooKeeper. I know we have some tools, but I don't
know how much they are actually used and tested.

Ideally, we would have a single way to back up the data for an Accumulo cluster
and later restore it (import/export table is a likely candidate for simplicity). Perhaps removing
table configuration from ZK isn't a good idea, but I definitely think user information should
not be stored in ZK.

> [UMBRELLA] Remove non-transient data from ZooKeeper
> ---------------------------------------------------
>                 Key: ACCUMULO-3842
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: Josh Elser
>             Fix For: 1.8.0
> Wanted to start brainstorming about this.
> We store a lot of persistent data in ZooKeeper that would be better stored in something
> backed by HDFS. ZooKeeper can be a very convenient place to store persisted data so that it's
> available to all nodes, but it comes at a price and often must be accessed asynchronously
> to achieve good performance.
> * Table/Namespace configuration
> * Users/Authorizations
> * Problem reports (maybe?)
> * System configuration overrides (maybe?)
> Some benefits we'd see from this:
> * Loss of ZooKeeper doesn't lose table configuration and users.
> * Greatly reduce zookeeper watchers (assume watchers=50*num_tables*num_tservers)
> * Consistent updates of table constraints and all other table properties
> The last note is the most important one IMO. Judging by the number of test issues alone that
> we've had with constraints not being seen on all servers, this is bound to affect users.
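
To give the watcher estimate in the issue description some scale, here is a minimal sketch of the arithmetic. The factor of 50 is the assumption stated in the description (watchers=50*num_tables*num_tservers); the function name and the cluster sizes are hypothetical and chosen only for illustration.

```python
def estimate_watchers(num_tables, num_tservers, watchers_per_table=50):
    """Rough ZooKeeper watcher count using the estimate from the issue text.

    watchers_per_table=50 is the assumed factor from the description,
    not a measured value.
    """
    return watchers_per_table * num_tables * num_tservers


# Hypothetical cluster: 100 tables served by 20 tablet servers.
print(estimate_watchers(num_tables=100, num_tservers=20))  # 100000
```

Under that estimate the count grows with the product of tables and tablet servers, so doubling either one doubles the number of watchers.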

This message was sent by Atlassian JIRA
