Date: Sat, 23 May 2015 19:19:17 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Commented] (ACCUMULO-3842) [UMBRELLA] Remove non-transient data from ZooKeeper

    [ https://issues.apache.org/jira/browse/ACCUMULO-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557486#comment-14557486 ]

Josh Elser commented on ACCUMULO-3842:
--------------------------------------

Caught up on some procv2 HBase stuff.

HBASE-13571 deals with schema updates. This is done by re-opening every region, which isn't relevant to what we're talking about here. HBASE-13687 and HBASE-13688 mention that there is a "missing piece" that might be relevant, but the information is lacking.
On the original design docs, I see

{quote}
Multi-Machine Procedures and Timeouts

Operations like Snapshots or ACLs cache updates requires a bit of coordination across multiple machine. To do that the procedure will send a message (may be done as poll via heartbeat) to each machine required by the procedure and will wait until each one respond. The procedure can have a timeout that will trigger a failure of the procedure causing the rollback.
{quote}

Nothing here looks novel: I don't see what adopting procv2 would gain us that we couldn't already do with FATE. I'm happy to entertain a conversation if I missed something, but, from what I've read so far, I don't see a reason why we'd want to adopt procv2 presently.

> [UMBRELLA] Remove non-transient data from ZooKeeper
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3842
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3842
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: Josh Elser
>             Fix For: 1.8.0
>
>
> Wanted to start brainstorming about this.
> We store a lot of persistent data in ZooKeeper that would be better stored in something backed by HDFS. ZooKeeper can be a very convenient place to store persisted data so that it's available to all nodes, but it comes at a price and often must be asynchronously accessed to achieve good performance.
> * Table/Namespace configuration
> * Users/Authorizations
> * Problem reports (maybe?)
> * System configuration overrides (maybe?)
> Some benefits we'd see from this:
> * Loss of ZooKeeper doesn't lose table configuration and users.
> * Greatly reduce ZooKeeper watchers (assume watchers=50*num_tables*num_tservers)
> * Consistent updates of table constraints and all other table properties
> The last note is the most important one IMO. Judging by the number of test issues alone that we've had with constraints not being seen on all servers, this is bound to affect users.
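
For a rough sense of scale, the watcher estimate from the issue description (watchers=50*num_tables*num_tservers) can be worked through in a few lines of Java. This is only a sketch: the class and method names are hypothetical, the factor of 50 is the issue's own assumption, and the 100-table, 200-tserver cluster is an assumed example rather than a measurement.

```java
public class WatcherEstimate {
    // Per-table, per-tserver watcher count assumed in the issue description.
    static final long WATCHERS_PER_TABLE_PER_TSERVER = 50L;

    // Applies watchers = 50 * num_tables * num_tservers, failing loudly on overflow.
    static long estimateWatchers(long numTables, long numTservers) {
        long perTserver = Math.multiplyExact(WATCHERS_PER_TABLE_PER_TSERVER, numTables);
        return Math.multiplyExact(perTserver, numTservers);
    }

    public static void main(String[] args) {
        // Hypothetical cluster size, chosen only for illustration.
        long numTables = 100L;
        long numTservers = 200L;
        System.out.println("Estimated ZooKeeper watchers: "
                + estimateWatchers(numTables, numTservers));
    }
}
```

Under that assumption the count grows with the product of tables and tservers, so doubling either one doubles the number of watchers ZooKeeper has to track.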