Date: Tue, 22 Apr 2014 17:20:15 +0000 (UTC)
From: "Mike Drob (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Updated] (ACCUMULO-2716) Duplicate connection loss logging in Writer

[ https://issues.apache.org/jira/browse/ACCUMULO-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Drob updated ACCUMULO-2716:
--------------------------------
    Description:
Running CI with agitation, I see lots of duplicated messages in the monitor whenever a tserver dies.
| WARN | Error connecting to tserver1.example.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |
| ERROR | error sending update to tserver1.example.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |

These always occur in pairs, at the same millisecond, and coming from the same tserver. I _think_ that they are updates to the metadata table coming from these tservers, like flushes or compactions that fail because the dead server was hosting the corresponding metadata tablet, but it doesn't really matter.

The culprit is in Writer.java where we log-and-rethrow in {{updateServer()}}:
{code}
} catch (TTransportException e) {
  log.warn("Error connecting to " + server + ": " + e);
  throw e;
}
{code}
and then later log again in {{update()}}:
{code}
} catch (TException e) {
  log.error("error sending update to " + tabLoc.tablet_location + ": " + e);
  TabletLocator.getLocator(instance, table).invalidateCache(tabLoc.tablet_extent);
}
{code}

  was:
Running CI with agitation, I see lots of duplicated messages in the monitor whenever a tserver dies.

| WARN | Error connecting to tserver1.example.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |
| ERROR | error sending update to a2422.halxg.cloudera.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |

These always occur in pairs, at the same millisecond, and coming from the same tserver. I _think_ that they are updates to the metadata table coming from these tservers, like flushes or compactions that fail because the dead server was hosting the corresponding metadata tablet, but it doesn't really matter.
The culprit is in Writer.java where we log-and-rethrow in {{updateServer()}}:
{code}
} catch (TTransportException e) {
  log.warn("Error connecting to " + server + ": " + e);
  throw e;
}
{code}
and then later log again in {{update()}}:
{code}
} catch (TException e) {
  log.error("error sending update to " + tabLoc.tablet_location + ": " + e);
  TabletLocator.getLocator(instance, table).invalidateCache(tabLoc.tablet_extent);
}
{code}


> Duplicate connection loss logging in Writer
> -------------------------------------------
>
>                 Key: ACCUMULO-2716
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2716
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>            Reporter: Mike Drob
>            Assignee: Mike Drob
>              Labels: logging
>
> Running CI with agitation, I see lots of duplicated messages in the monitor whenever a tserver dies.
> | WARN | Error connecting to tserver1.example.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |
> | ERROR | error sending update to tserver1.example.com:10011: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused |
> These always occur in pairs, at the same millisecond, and coming from the same tserver. I _think_ that they are updates to the metadata table coming from these tservers, like flushes or compactions that fail because the dead server was hosting the corresponding metadata tablet, but it doesn't really matter.
> The culprit is in Writer.java where we log-and-rethrow in {{updateServer()}}:
> {code}
> } catch (TTransportException e) {
>   log.warn("Error connecting to " + server + ": " + e);
>   throw e;
> }
> {code}
> and then later log again in {{update()}}:
> {code}
> } catch (TException e) {
>   log.error("error sending update to " + tabLoc.tablet_location + ": " + e);
>   TabletLocator.getLocator(instance, table).invalidateCache(tabLoc.tablet_extent);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
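
For illustration only, the log-and-rethrow pattern described in the issue can be avoided by letting the inner call propagate the exception and logging once in the caller. The sketch below is not the Accumulo code and not necessarily the fix that was applied. The class `SingleLogDemo`, the method `sendUpdate()`, and the in-memory `LOG` list are hypothetical, and `java.net.ConnectException` stands in for Thrift's `TTransportException` so the example runs without Thrift on the classpath.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Accumulo.
class SingleLogDemo {
    // Stand-in for a real logger so the example can show what gets logged.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical low-level call: it propagates the failure without logging it.
    static void updateServer(String server) throws IOException {
        // Simulate the dead tserver from the report.
        throw new ConnectException("Connection refused");
    }

    // Hypothetical caller: the only place where the failure is logged.
    static List<String> sendUpdate(String server) {
        LOG.clear();
        try {
            updateServer(server);
        } catch (IOException e) {
            LOG.add("ERROR error sending update to " + server + ": " + e);
            // Recovery work (such as invalidating a cached location) would go here.
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String line : sendUpdate("tserver1.example.com:10011")) {
            System.out.println(line);
        }
    }
}
```

With this shape, one failed connection produces one log entry. Whether the surviving message belongs at WARN or ERROR is a separate decision that the sketch does not settle.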