Date: Fri, 22 May 2015 19:10:17 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-3842) [UMBRELLA] Remove non-transient data from ZooKeeper

[ https://issues.apache.org/jira/browse/ACCUMULO-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556650#comment-14556650 ]

Josh Elser commented on ACCUMULO-3842:
--------------------------------------

Thanks for taking a read over this (very) bare outline so far.

bq. I'm not sure how the change proposed would manifest the third benefit you mention (consistent updates of table props). Can you explain that, please? As I understand it, we use ZooKeeper, because it has watchers, which we can use to get consistency. I'm not aware of any similar mechanism with any alternatives.

So, right now, we have eventually consistent configuration updates for tables. We don't know when the watchers will fire, but (IIRC) we know they will fire in the correct order and every server will eventually see all updates. What we should really have, to match the API we present, is a strongly consistent means to update configurations.

ZooKeeper doesn't keep us from accomplishing this; we would need to write code to actually get the strong consensus for ourselves. I know this is very hand-wavy at this point, but I think we need to start thinking about this problem, because it has repeatedly caused us trouble just in writing reasonable tests for Accumulo for ~2 years now.

> [UMBRELLA] Remove non-transient data from ZooKeeper
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3842
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3842
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: Josh Elser
>             Fix For: 1.8.0
>
>
> Wanted to start brainstorming about this.
> We store a lot of persistent data in ZooKeeper that would be better stored in something backed by HDFS. ZooKeeper can be a very convenient place to store persisted data so that it's available to all nodes, but it comes at a price and often must be asynchronously accessed to achieve good performance.
> * Table/Namespace configuration
> * Users/Authorizations
> * Problem reports (maybe?)
> * System configuration overrides (maybe?)
> Some benefits we'd see from this:
> * Loss of ZooKeeper doesn't lose table configuration and users.
> * Greatly reduce zookeeper watchers (assume watchers=50*num_tables*num_tservers)
> * Consistent updates of table constraints and all other table properties
> The last note is the most important one IMO. The number of test issues alone that we've had with constraints not being seen on all servers are bound to affect users.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
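
To put the watcher estimate from the issue description (watchers=50*num_tables*num_tservers) in concrete terms, here is a minimal arithmetic sketch. The function name and the cluster sizes are hypothetical, chosen only for illustration; the factor of 50 is the rough assumption stated in the issue, not a measured value.

```python
def estimated_watchers(num_tables: int, num_tservers: int,
                       per_table_per_server: int = 50) -> int:
    """Rough ZooKeeper watcher count, using the rule of thumb from the issue:
    watchers = 50 * num_tables * num_tservers.

    The default of 50 is the issue's own assumption, not a measured constant.
    """
    if num_tables < 0 or num_tservers < 0 or per_table_per_server < 0:
        raise ValueError("counts must be non-negative")
    return per_table_per_server * num_tables * num_tservers


if __name__ == "__main__":
    # Hypothetical cluster sizes, for illustration only.
    for tables, tservers in [(10, 20), (100, 200), (1000, 500)]:
        count = estimated_watchers(tables, tservers)
        print(f"{tables} tables x {tservers} tservers -> ~{count:,} watchers")
```

Because the estimate is a product, the watcher count grows with both the number of tables and the number of tablet servers, which is why moving table configuration out of ZooKeeper is listed as a way to reduce it.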