chukwa-dev mailing list archives

From "Bill Graham (JIRA)" <>
Subject [jira] Updated: (CHUKWA-534) Improve fault-tolerance of DemuxManager, PostProcessManager and ChukwaArchiveManager.
Date Wed, 20 Oct 2010 20:24:24 GMT


Bill Graham updated CHUKWA-534:

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks Ari, committed.

> Improve fault-tolerance of DemuxManager, PostProcessManager and ChukwaArchiveManager.
> -------------------------------------------------------------------------------------
>                 Key: CHUKWA-534
>                 URL:
>             Project: Chukwa
>          Issue Type: Improvement
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-534_1.patch, CHUKWA-534_2.patch, CHUKWA-534_3.patch
> If any of these processes receives more than N consecutive errors, it dies with the
> message "Too many errors, Bail out!".
> Let's change this to introduce a configurable number of consecutive exceptions that may be
> encountered before dying. If the value is set to -1, the expected behavior is to keep retrying
> ad infinitum.
> Also part of this issue is to improve logging of how many consecutive errors have occurred,
> as well as the time they started. A possible future enhancement could be to support an error
> time threshold as well as an absolute count.
> Suggesting the following new config settings. They're a bit verbose, but they're clear.
> {noformat}
> demux.max.error.count.before.shutdown
> post.process.max.error.count.before.shutdown
> archive.max.error.count.before.shutdown
> {noformat}
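
The behavior described above (a configurable limit on consecutive errors, -1 meaning retry forever, and tracking of when the error streak started) could be sketched roughly as follows. This is an illustration only, not the code from the attached patches: the class name ErrorBudget, its methods, and the use of java.util.Properties in place of Chukwa's own configuration classes are all assumptions. Only the three property names come from the issue.

```java
import java.util.Properties;

/**
 * Hypothetical helper illustrating the proposal; not the committed CHUKWA-534 code.
 * Tracks consecutive errors and decides when a manager process should shut down.
 */
public class ErrorBudget {
    /** Sentinel from the issue description: -1 means keep retrying forever. */
    public static final int RETRY_FOREVER = -1;

    private final int maxErrors;
    private int consecutiveErrors = 0;
    private long firstErrorMillis = -1L; // -1 while there is no active error streak

    public ErrorBudget(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    /** Reads the limit from a property such as demux.max.error.count.before.shutdown. */
    public static ErrorBudget fromProperties(Properties props, String key, int defaultMax) {
        String raw = props.getProperty(key);
        int max = (raw == null) ? defaultMax : Integer.parseInt(raw.trim());
        return new ErrorBudget(max);
    }

    /** A successful run ends the current error streak. */
    public void recordSuccess() {
        consecutiveErrors = 0;
        firstErrorMillis = -1L;
    }

    /**
     * Records one failure at the given time.
     * Returns true once more than maxErrors consecutive errors have occurred,
     * and never returns true when the limit is RETRY_FOREVER.
     */
    public boolean recordError(long nowMillis) {
        if (consecutiveErrors == 0) {
            firstErrorMillis = nowMillis; // remember when this streak began, for logging
        }
        consecutiveErrors++;
        return maxErrors != RETRY_FOREVER && consecutiveErrors > maxErrors;
    }

    public int getMaxErrors() { return maxErrors; }
    public int getConsecutiveErrors() { return consecutiveErrors; }
    public long getFirstErrorMillis() { return firstErrorMillis; }
}
```

A manager loop would call recordError(System.currentTimeMillis()) in its catch block, log getConsecutiveErrors() and getFirstErrorMillis(), and exit only when the call returns true. It would call recordSuccess() after each clean iteration.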

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
