incubator-flume-user mailing list archives

From Joe Crobak <joec...@gmail.com>
Subject Re: E2E mode with decorators
Date Fri, 19 Aug 2011 13:33:12 GMT
Thanks Jon.  This makes perfect sense and helps a lot.  It seems to make a
lot of sense to change the decorator we've been working on to add attributes
to the events rather than replacing them outright.

Joe

On Fri, Aug 19, 2011 at 3:45 AM, Jonathan Hsieh <jon@cloudera.com> wrote:

> The acks are generated from checksums of the body of events.  So if you
> augment your events with new attributes (regex, value) the acks will still
> work.  However, if you filter out events between the agentSink and the
> collectorSink, the checksums won't add up.
>
> You can, however, put filtering "after" the collector, or do filtering "next
> to" the collector.
>
> Ok because value adds attributes and does not modify the body.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | value("newattr","newvalue") collectorSink("hdfs://xxxx", ...);
>
> Ok because the filter runs before checksums are calculated.
> node : <source> | filterOutEvents agentE2ESink("ip of collector");
> collector: collectorSource | collectorSink("hdfs://xxxx", ...);
>
> Ok because filter is after checksums are validated.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | collector(xxx) { filterOutEvents escapedFormatDfs("hdfs://xxxx", ...) } ;
>
> Not ok -- checksums won't work out because the filtered-out events, which
> carry checksum info, never reach the checksum calculation.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | filterOutEvents collectorSink("hdfs://xxxx", ...);
>
> Does that make sense?
>
> Jon.
>
>
> On Wed, Aug 17, 2011 at 2:39 AM, Bao Thai Ngo <baothaingo@gmail.com> wrote:
>
>> Hi,
>>
>> As far as I understand, the ACK mechanism should work regardless of any
>> decorator deployed at the Collector, as Mingje said. I developed and deployed
>> several plug-ins (decorators) that filter out events on the Collector side,
>> and they work well with ACK. Another thing I can suggest: do not try to
>> implement ACK handling for events in your decorator.
>>
>> @Felix: Some advantages of deploying a decorator on the collector side are:
>> - no dependency on the agent side
>> - we collect the data we need and save the rest for future needs (what we
>> need is just a small part of a very large data set)
>>
>> Just my 2 cents.
>>
>> ~Thai
>>
>>
>> On Tue, Aug 16, 2011 at 9:52 PM, Joe Crobak <joecrow@gmail.com> wrote:
>>
>>> According to the Flume FAQ [1], Flume acks events from the CollectorSink
>>> in E2E mode.  If I have a Decorator running on the Collector that filters
>>> out events (or transforms them or something), does that mean those events
>>> won't get ACK'd and thus delivery will be retried for them
>>> indefinitely? IOW, is E2E mode unsupported in this situation -- or maybe is
>>> there a way for me to ACK events that I want to filter from the Decorator
>>> itself?
>>>
>>> Thanks,
>>> Joe
>>>
>>>
>>> [1] https://github.com/cloudera/flume/wiki/FAQ
>>>
>>
>>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
>
>
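Jon's four cases above can be reproduced with a toy model. The following is a minimal Python sketch, not Flume code. It assumes the group checksum is the XOR of CRC32 values computed over event bodies (Flume's actual way of combining checksums may differ), and every helper name here (agent_side, collector_check, add_attribute, filter_out) is hypothetical and exists only for illustration.

```python
import zlib


def body_checksum(body: bytes) -> int:
    # Only the body is hashed; attributes are deliberately ignored.
    return zlib.crc32(body)


def agent_side(events):
    """Toy stand-in for the agent: fold all body checksums into one value.

    Assumption: checksums are combined with XOR.
    """
    total = 0
    for ev in events:
        total ^= body_checksum(ev["body"])
    return total


def collector_check(events, expected):
    """Toy stand-in for the collector-side check: recompute and compare."""
    return agent_side(events) == expected


def add_attribute(events, key, value):
    # Like value("newattr", "newvalue"): bodies are left untouched.
    return [{**ev, "attrs": {**ev["attrs"], key: value}} for ev in events]


def filter_out(events, needle: bytes):
    # Hypothetical filtering decorator: drops events whose body contains needle.
    return [ev for ev in events if needle not in ev["body"]]


events = [
    {"body": b"login ok", "attrs": {}},
    {"body": b"debug noise", "attrs": {}},
    {"body": b"logout", "attrs": {}},
]
expected = agent_side(events)

# Case 1: attributes added on the collector before the check.
ok_attrs = collector_check(add_attribute(events, "newattr", "newvalue"), expected)

# Case 2: filter on the agent before checksums are calculated.
kept = filter_out(events, b"debug")
ok_filter_before = collector_check(kept, agent_side(kept))

# Case 3: check first, then filter what gets written out.
ok_filter_after = collector_check(events, expected)
stored = filter_out(events, b"debug")

# Case 4: filter between the agent and the collector-side check.
ok_filter_between = collector_check(filter_out(events, b"debug"), expected)

print(ok_attrs, ok_filter_before, ok_filter_after, ok_filter_between)
```

In this model only the fourth case fails, because the collector recomputes the checksum over fewer bodies than the agent did. That matches the ordering rule in Jon's message: filtering is safe before the checksums are calculated or after they are validated, but not in between.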
