From: "Christopher Tubbs (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Date: Fri, 31 Jan 2014 19:24:11 +0000 (UTC)
Subject: [jira] [Commented] (ACCUMULO-2140) Race conditions between client operations and upgrade

    [ https://issues.apache.org/jira/browse/ACCUMULO-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888040#comment-13888040 ]

Christopher Tubbs commented on ACCUMULO-2140:
---------------------------------------------

I did see a new trace table get re-created on upgrade, because it wasn't in the default namespace and was therefore not visible to the client. However, I fixed the bug in the code that wasn't putting it in the correct namespace.
The re-creation was certainly caused by one kind of race condition: the tracer client had a different view of ZooKeeper while waiting on the upgrade to occur, thought the table didn't exist, and so sent a request to create it. However, I now realize that the table was not re-created *during* the upgrade, but after it, because the RPC probably waited on the master's client service being available. It's not clear to me why the master would've allowed this request to complete, but I don't think it's possible anymore, as I haven't seen it since.

> Race conditions between client operations and upgrade
> -----------------------------------------------------
>
>                 Key: ACCUMULO-2140
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2140
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Christopher Tubbs
>            Assignee: Christopher Tubbs
>            Priority: Blocker
>             Fix For: 1.6.0
>
>
> While the master is upgrading, it also has a thread that is responding to client requests. Since the upgrade renames tables and puts them in namespaces, there is a short period of time during which table existence checks that rely on the new ZooKeeper schema for tables fail to provide the correct answer.
> Example: when the tracer starts, it tries to create a "trace" table if one doesn't exist. The existence check returns false, so it creates a new trace table in the default namespace, even though an old one exists that has not yet been moved into the default namespace by the upgrade. This results in two tables with the same name.
> An easy solution would be to not respond to client requests until after the upgrade is complete (e.g. wait to start up the MasterClientServiceHandler thread).
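
For illustration, the "easy solution" from the issue description (don't answer client requests until the upgrade is complete) can be sketched in a few lines of Java. This is not Accumulo code: the class UpgradeGate and its methods finishUpgrade, createTableIfAbsent and tableCount are hypothetical names made up for this sketch, and an in-memory set stands in for the table metadata kept in ZooKeeper. It only shows the general idea of gating a create-if-absent request on a latch that the upgrade opens when it is done.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not part of Accumulo: client requests wait on a latch
// that is opened only after the upgrade has finished moving tables.
final class UpgradeGate {
    // Opened exactly once, when the upgrade completes.
    private final CountDownLatch upgradeDone = new CountDownLatch(1);
    // Stand-in for the table names visible under the new schema.
    private final Set<String> tables = ConcurrentHashMap.newKeySet();

    // Called by the upgrade thread once old tables are visible in their namespaces.
    void finishUpgrade(String... migratedTables) {
        for (String t : migratedTables) {
            tables.add(t);
        }
        upgradeDone.countDown();
    }

    // Client-facing request: blocks until the upgrade is done (or the timeout
    // expires), then creates the table only if it does not already exist.
    // Returns true if a new table was created.
    boolean createTableIfAbsent(String name, long timeoutMs) {
        try {
            if (!upgradeDone.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("upgrade still in progress");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for upgrade", e);
        }
        return tables.add(name);
    }

    int tableCount() {
        return tables.size();
    }
}
```

With this gate, a tracer that asks for "trace" before the upgrade finishes simply waits, and once the existing table has been migrated the existence check sees it, so no second table with the same name is created. Whether a real request should block, time out, or be rejected outright is a design choice this sketch does not settle.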