Date: Tue, 4 Nov 2014 19:43:34 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Created] (ACCUMULO-3296) Infinite ZK retry loop somewhere

Josh Elser created ACCUMULO-3296:
------------------------------------

             Summary: Infinite ZK retry loop somewhere
                 Key: ACCUMULO-3296
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3296
             Project: Accumulo
          Issue Type: Bug
          Components: master
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 1.6.2, 1.7.0


ShutdownIT-shutdownDuringQuery failed.
The end of the master log had the following:

{noformat}
2014-11-04 09:47:56,220 [master.LiveTServerSet] INFO : Removing zookeeper lock for tserver:39492[1497a3301100002]
2014-11-04 09:47:56,243 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:56,494 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:56,745 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:56,996 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:57,247 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:57,498 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:57,749 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:58,000 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:58,252 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:58,503 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:58,754 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:59,006 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:59,257 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:59,508 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:47:59,759 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:48:00,011 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:48:00,262 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
2014-11-04 09:48:00,513 [zookeeper.Retry] DEBUG: Sleeping for 250ms before retrying operation
{noformat}

The Retry log message kept repeating until the test timed out.
Every invocation of that sleep should also log a message with the caught exception that caused the retry. It seems likely that recursiveDelete isn't doing something correctly, given that it was the last thing the Master was about to do.
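
To illustrate the logging change suggested above, here is a minimal sketch of a retry helper that records the caught exception on every sleep and stops after a fixed number of retries. This is not Accumulo's actual zookeeper.Retry class: the name BoundedRetry, its call method, the maxRetries and waitMillis parameters, and the in-memory LOG list (standing in for a real logger) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not Accumulo's zookeeper.Retry. */
final class BoundedRetry {
    /** Stand-in for a real logger so the sketch stays self-contained. */
    static final List<String> LOG = new ArrayList<>();

    /**
     * Runs op, retrying on failure at most maxRetries times and sleeping
     * waitMillis between attempts. Each retry message includes the exception
     * that triggered it, so a repeating failure is visible in the log.
     */
    static <T> T call(Callable<T> op, int maxRetries, long waitMillis) throws Exception {
        int retries = 0;
        while (true) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                // Do not retry on interruption; restore the flag and propagate.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                if (retries >= maxRetries) {
                    LOG.add("Giving up after " + retries + " retries: " + e);
                    throw e;
                }
                retries++;
                LOG.add("Sleeping for " + waitMillis + "ms before retry " + retries
                        + " of " + maxRetries + ", caused by: " + e);
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

With a bound like this, a failure that never clears surfaces as an exception carrying its cause instead of an endless run of identical DEBUG lines. Whether a hard retry limit is appropriate for the Master's ZooKeeper operations is a separate design question that this sketch does not settle.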