chukwa-dev mailing list archives

From Ariel Rabkin <>
Subject Re: Alerting framework - feature idea
Date Wed, 04 Nov 2009 02:12:43 GMT
We're using the SocketTeeWriter to watch web server logs and using
that for near-real-time provisioning.  The data rate is quite small,
and some error is OK.  We could have done it without Chukwa, but we
like having a durable copy for later analysis.
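
A minimal sketch of the kind of separate-process consumer described above, assuming the tee delivers line-oriented text over a plain TCP socket on port 9094 (the port mentioned later in this thread). The class name TeeLineConsumer and the consume helper are hypothetical and not part of Chukwa. Any handshake or framing the real SocketTeeWriter expects is not covered here, so check the Chukwa documentation before relying on it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical consumer, not a Chukwa class.
final class TeeLineConsumer {

    // Reads the source line by line until end of stream, hands each line
    // to the handler, and returns how many lines were handled.
    static long consume(Reader source, Consumer<String> handler) {
        BufferedReader reader = new BufferedReader(source);
        long count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: collector on localhost, tee on port 9094.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9094;
        try (Socket socket = new Socket(host, port)) {
            Reader in = new InputStreamReader(
                    socket.getInputStream(), StandardCharsets.UTF_8);
            // Replace the println with whatever near-real-time handling is needed.
            consume(in, line -> System.out.println(line));
        }
    }
}
```

Because the consumer runs outside the collector, a slow or crashing handler costs the collector nothing beyond the socket write.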


On Tue, Nov 3, 2009 at 7:17 PM, Jerome Boulon <> wrote:
> Hi,
> I agree with Ari, the post-processing should be on another process/machine
> since we don't want to take more time/cpu/mem on the collector side.
> Ari, could you give us some details on how you're using the SocketTeeWriter?
> Thanks,
>  /Jerome.
> On 11/3/09 5:27 PM, "Thushara Wijeratna" <> wrote:
>> yeah, that makes sense. i don't have a strong argument, except it
>> might be a bit easier to integrate alerting into the system.
>> swatch is pretty good, however, for custom processing, for each
>> pattern matched, a separate process needs to be run. if alerts are
>> rare, as is generally the case, that is not a big problem.
>> one reason i'm considering Chukwa instead of swatch is that it
>> centralizes the input logs at the collector - swatch AFAIK doesn't
>> perform any centralization of logs.
>> thanks,
>> thushara
>> On Tue, Nov 3, 2009 at 3:51 PM, Ariel Rabkin <> wrote:
>>> What you describe is certainly doable. I'm not sure what the use case
>>> is, though.
>>> The core goal for Chukwa is to facilitate MapReduce processing of
>>> logs. The idea of the SocketTeeWriter is to get a "sneak peek" at
>>> data, before it gets stored to HDFS. If collectors crash or get
>>> overloaded, data can get processed more than once by collectors. So
>>> there's a real cost to the real-time path.
>>> One of the main benefits of SocketTee is that the processing can
>>> happen in a separate process, or even on a separate machine.
>>> Integrating the pattern-matching in the pipeline is certainly doable,
>>> but it's not clear to me that that's an architecture we want to
>>> encourage or commit to.
>>> If people want Swatch, they know where to find it. What's the argument
>>> for needing to emulate it, real-time, in Chukwa?
>>> --Ari
>>> On Tue, Nov 3, 2009 at 3:48 PM, Thushara Wijeratna <> wrote:
>>>> Would it be useful to provide something similar to the Swatch Log
>>>> monitoring for Chukwa?
>>>> Currently, we can listen to port 9094 (after running a
>>>> SocketTeeWriter), and handle each input line.
>>>> I'm wondering whether there would be value in adding some more
>>>> infrastructure code to Chukwa that will:
>>>> 1. do some regular expression parsing and filter the lines with the
>>>> alert condition(s)
>>>> 2. perform some standard actions, such as sending email
>>>> 3. provide an interface to perform custom handling for the user
>>>> The basic core will be something like this:
>>>> interface AlertCallback {
>>>>    // Called for each line that matches alertExp.
>>>>    boolean handle(String alertExp, String line);
>>>> }
>>>> class AlertWriter extends PipelinableWriter {
>>>>    private String[] alertExps;
>>>>    private AlertCallback alertCB;
>>>>    public AlertWriter(String[] alertExps, AlertCallback alertCB) {
>>>>       this.alertExps = alertExps;
>>>>       this.alertCB = alertCB;
>>>>    }
>>>> }
>>>> It seems like most of the plumbing is already there, exposed in
>>>> the SocketTeeWriter class, for example the Filter class.
>>>> If you all think it is a good idea, I can help with this.
>>>> thanks,
>>>> thushara
>>> --
>>> Ari Rabkin
>>> UC Berkeley Computer Science Department

Ari Rabkin
UC Berkeley Computer Science Department
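
The regular-expression filtering and callback idea proposed in the thread can be sketched outside the collector as follows. This is only an illustration under stated assumptions: AlertMatcher is a hypothetical standalone class that does not extend any Chukwa writer, AlertCallback mirrors the interface proposed above, and the meaning of the boolean return value is left to the caller because the thread does not define it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical callback, modeled on the interface proposed in the thread.
interface AlertCallback {
    boolean handle(String alertExp, String line);
}

// Hypothetical standalone matcher; it does not plug into the Chukwa pipeline.
final class AlertMatcher {
    private final List<Pattern> patterns = new ArrayList<>();
    private final AlertCallback callback;

    AlertMatcher(String[] alertExps, AlertCallback callback) {
        // Compile each expression once, not once per line.
        for (String exp : alertExps) {
            patterns.add(Pattern.compile(exp));
        }
        this.callback = callback;
    }

    // Invokes the callback once per expression found anywhere in the line
    // and returns how many expressions matched.
    int process(String line) {
        int fired = 0;
        for (Pattern p : patterns) {
            if (p.matcher(line).find()) {
                callback.handle(p.pattern(), line);
                fired++;
            }
        }
        return fired;
    }
}
```

Running a matcher like this in the process that reads from the tee socket keeps the pattern-matching cost off the collector, which is the separation argued for in the replies.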
