chukwa-dev mailing list archives

From "Ari Rabkin (JIRA)" <>
Subject [jira] Commented: (CHUKWA-534) Improve fault-tolerance of DemuxManager, PostProcessManager and ChukwaArchiveManager.
Date Fri, 15 Oct 2010 17:56:53 GMT


Ari Rabkin commented on CHUKWA-534:

Looks good.  +1 to commit it.

> Improve fault-tolerance of DemuxManager, PostProcessManager and ChukwaArchiveManager.
> -------------------------------------------------------------------------------------
>                 Key: CHUKWA-534
>                 URL:
>             Project: Chukwa
>          Issue Type: Improvement
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-534_1.patch, CHUKWA-534_2.patch, CHUKWA-534_3.patch
> If any of these processes receives more than N consecutive errors, it dies with the message "Too many errors, Bail out!".
> Let's change this to introduce a configurable number of consecutive exceptions that may be encountered before dying. If the value is set to -1, the expected behavior is to keep retrying ad infinitum.
> Also part of this issue is improving the logging of how many consecutive errors have occurred, as well as the time they started. A possible future enhancement could be to support an error time threshold as well as an absolute count.
> Suggesting the following new config settings. They're a bit verbose, but they're clear.
> {noformat}
> demux.max.error.count.before.shutdown
> post.process.max.error.count.before.shutdown
> archive.max.error.count.before.shutdown
> {noformat}
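
For illustration only, the proposed behavior could be sketched roughly as below. This is not taken from the attached patches: the class name ConsecutiveErrorTracker and its methods are hypothetical, and it assumes that shutdown triggers once the count of consecutive errors exceeds the configured limit ("more than N"), that any success resets the count, and that -1 disables the limit.

```java
// Hypothetical helper, not part of Chukwa or the CHUKWA-534 patches.
final class ConsecutiveErrorTracker {
    /** Sentinel from the proposal: -1 means keep retrying forever. */
    static final int RETRY_FOREVER = -1;

    private final int maxErrorsBeforeShutdown;
    private int consecutiveErrors = 0;
    private long firstErrorTimeMillis = 0L;

    ConsecutiveErrorTracker(int maxErrorsBeforeShutdown) {
        this.maxErrorsBeforeShutdown = maxErrorsBeforeShutdown;
    }

    /**
     * Records one failure at the given time.
     * Returns true if the caller should shut down (assumed rule:
     * more than the configured number of consecutive errors).
     */
    boolean recordError(long nowMillis) {
        if (consecutiveErrors == 0) {
            // Remember when the current run of errors started, for logging.
            firstErrorTimeMillis = nowMillis;
        }
        consecutiveErrors++;
        return maxErrorsBeforeShutdown != RETRY_FOREVER
                && consecutiveErrors > maxErrorsBeforeShutdown;
    }

    /** A successful iteration ends the current run of errors. */
    void recordSuccess() {
        consecutiveErrors = 0;
        firstErrorTimeMillis = 0L;
    }

    int getConsecutiveErrors() {
        return consecutiveErrors;
    }

    long getFirstErrorTimeMillis() {
        return firstErrorTimeMillis;
    }
}
```

A manager loop would call recordError(System.currentTimeMillis()) in its catch block, log getConsecutiveErrors() and getFirstErrorTimeMillis(), and exit only when recordError returns true. The limit itself would be read from one of the settings listed above, for example demux.max.error.count.before.shutdown.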

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
