accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3842) [UMBRELLA] Remove non-transient data from ZooKeeper
Date Tue, 26 May 2015 20:16:37 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559782#comment-14559782 ]

Christopher Tubbs commented on ACCUMULO-3842:
---------------------------------------------

bq.  I definitely think users information should not be stored in ZK.

Do you just mean the user password/authorizations/permissions database? Those are already
pluggable and do not have to live in ZK. It would be a pain to put them inside Accumulo, since
that creates a bit of a circular dependency that would be prone to concurrency problems, but it
could be done.

Do you have a strong reason you can express for changing the default behavior of storing this
in ZK? Or is it just the lack of good comprehensive backup/restore tools which you've already
mentioned (which seems to me to be an easier problem to solve)?

From my perspective, ZK seems to be a relatively solid component. Because of that, it
seems to me that the burden is on any alternative to demonstrate a greater degree of reliability,
scalability, or other benefit.

> [UMBRELLA] Remove non-transient data from ZooKeeper
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3842
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3842
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: Josh Elser
>             Fix For: 1.8.0
>
>
> Wanted to start brainstorming about this.
> We store a lot of persistent data in ZooKeeper that would be better stored in something
> backed by HDFS. ZooKeeper can be a very convenient place to store persisted data so that it's
> available to all nodes, but it comes at a price and often must be asynchronously accessed
> to achieve good performance.
> * Table/Namespace configuration
> * Users/Authorizations
> * Problem reports (maybe?)
> * System configuration overrides (maybe?)
> Some benefits we'd see from this:
> * Loss of ZooKeeper doesn't lose table configuration and users.
> * Greatly reduce zookeeper watchers (assume watchers=50*num_tables*num_tservers)
> * Consistent updates of table constraints and all other table properties
> The last note is the most important one IMO. Judging by the number of test issues alone that
> we've had with constraints not being seen on all servers, this is bound to affect users.
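
For a sense of scale, the watcher estimate in the issue description (watchers=50*num_tables*num_tservers) can be worked through as below. This is only a rough sketch: the function name, the constant name, and the sample cluster sizes are hypothetical, and the factor of 50 is simply the figure the issue description assumes, not a measured value.

```python
# Rough estimate of ZooKeeper watchers, using the assumption stated in the
# issue description: watchers = 50 * num_tables * num_tservers.
# The factor of 50 is the issue's assumption, not a measured constant.
WATCHERS_PER_TABLE_PER_TSERVER = 50


def estimate_watchers(num_tables: int, num_tservers: int) -> int:
    """Return the estimated watcher count for a cluster (hypothetical helper)."""
    if num_tables < 0 or num_tservers < 0:
        raise ValueError("num_tables and num_tservers must be non-negative")
    return WATCHERS_PER_TABLE_PER_TSERVER * num_tables * num_tservers


if __name__ == "__main__":
    # Sample cluster sizes chosen purely for illustration.
    for tables, tservers in [(10, 10), (100, 100), (1000, 500)]:
        estimate = estimate_watchers(tables, tservers)
        print(f"{tables} tables x {tservers} tservers -> ~{estimate:,} watchers")
```

Under that assumption the count grows with the product of tables and tservers, so 100 tables on 100 tservers already works out to roughly 500,000 watchers.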



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)