chukwa-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ari Rabkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-369) proposed reliability mechanism
Date Tue, 18 Aug 2009 21:21:14 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744724#action_12744724
] 

Ari Rabkin commented on CHUKWA-369:
-----------------------------------

Eric:

The only state at collectors is soft, and regenerated periodically (depending on the scan
frequency, but think every few minutes.)  It's totally okay if all the collectors crash and
then restart; acks will be delayed for a while, but they'll eventually arrive.  

And all collectors share the same state, since they're ls-ing the same filesystem.  So I don't
see why flapping would be a problem. 

> proposed reliability mechanism
> ------------------------------
>
>                 Key: CHUKWA-369
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-369
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.3.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: delayedAcks.patch
>
>
> We like to say that Chukwa is a system for reliable log collection. It isn't, quite,
since we don't handle collector crashes.  Here's a proposed reliability mechanism.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message