From: "Bill Graham (JIRA)"
To: chukwa-dev@incubator.apache.org
Date: Tue, 12 Oct 2010 14:33:35 -0400 (EDT)
Subject: [jira] Created: (CHUKWA-534) Improve fault-tolerance of DemuxManager.
Message-ID: <23394856.100791286908415144.JavaMail.jira@thor>

Improve fault-tolerance of DemuxManager.
----------------------------------------

                 Key: CHUKWA-534
                 URL: https://issues.apache.org/jira/browse/CHUKWA-534
             Project: Chukwa
          Issue Type: Improvement
            Reporter: Bill Graham
            Assignee: Bill Graham

If the DemuxManager receives more than 5 consecutive errors, it dies with the message "Too many errors, Bail out!". Let's change this to introduce a configurable number of consecutive exceptions to be encountered before dying. If the value is set to -1, the expected behavior is to keep retrying ad infinitum.

Also part of this issue is to improve logging of how many consecutive errors have occurred, as well as the time they started.

A possible future enhancement could be to support an error time threshold as well as an absolute count.

Suggesting the following new config setting. It's a bit verbose, but it's clear.

{noformat}
chukwa.demux.max.error.count.before.shutdown
{noformat}
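
To make the proposed behavior concrete, here is a rough sketch of the counting logic. It is not the DemuxManager code and not a patch: the class ConsecutiveErrorTracker, its method names, the default of 5, and the use of java.util.Properties as a stand-in for Chukwa's configuration object are all assumptions for illustration. It also assumes shutdown happens once the count exceeds the limit, following the "more than 5" wording above.

```java
import java.util.Date;
import java.util.Properties;

/** Hypothetical helper, not part of Chukwa: tracks consecutive errors against a limit. */
class ConsecutiveErrorTracker {
    /** Setting name proposed in this issue. */
    static final String MAX_ERRORS_KEY = "chukwa.demux.max.error.count.before.shutdown";
    /** Per the issue text, -1 means keep retrying forever. */
    static final int RETRY_FOREVER = -1;
    /** Assumed default, mirroring the current hard-coded limit of 5. */
    static final int DEFAULT_MAX_ERRORS = 5;

    private final int maxErrors;
    private int consecutiveErrors = 0;
    private long firstErrorMillis = -1L;

    ConsecutiveErrorTracker(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    /** Reads the limit from plain Properties (a stand-in for the real configuration). */
    static ConsecutiveErrorTracker fromProperties(Properties props) {
        String raw = props.getProperty(MAX_ERRORS_KEY, String.valueOf(DEFAULT_MAX_ERRORS));
        return new ConsecutiveErrorTracker(Integer.parseInt(raw.trim()));
    }

    /** A successful run ends the current streak of errors. */
    void recordSuccess() {
        consecutiveErrors = 0;
        firstErrorMillis = -1L;
    }

    /**
     * Records one failure and remembers when the streak began.
     * Returns true if the caller should shut down, i.e. the streak
     * exceeds the limit and the limit is not RETRY_FOREVER.
     */
    boolean recordError(long nowMillis) {
        if (consecutiveErrors == 0) {
            firstErrorMillis = nowMillis;
        }
        consecutiveErrors++;
        return maxErrors != RETRY_FOREVER && consecutiveErrors > maxErrors;
    }

    int getConsecutiveErrors() {
        return consecutiveErrors;
    }

    /** Log-friendly summary: how many consecutive errors, and since when. */
    String describe() {
        if (consecutiveErrors == 0) {
            return "no consecutive errors";
        }
        String limit = (maxErrors == RETRY_FOREVER) ? "unlimited" : String.valueOf(maxErrors);
        return consecutiveErrors + " consecutive error(s) since "
                + new Date(firstErrorMillis) + " (limit: " + limit + ")";
    }
}
```

In a processing loop, the caller would invoke recordError(System.currentTimeMillis()) in the catch block, log describe(), and bail out only when recordError returns true; recordSuccess() would be called after each clean run. The error time threshold mentioned as a future enhancement could build on the stored start time, but is left out here.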