incubator-flume-user mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: E2E mode with decorators
Date Fri, 19 Aug 2011 07:45:22 GMT
The acks are generated from checksums of the body of events.  So if you
augment your events with new attributes (regex, value) the acks will still
work.  However, if you filter out events between the agentSink and the
collectorSink, the checksums won't add up.
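
To make that concrete, here is a rough Python sketch of the idea, not
Flume's actual code. It assumes an event is a dict with a "body" and
"attrs", and that per-event CRC32 values are combined with XOR; the names
body_checksum and batch_checksum are made up for illustration.

```python
import zlib

def body_checksum(event):
    # Only the body is hashed; attributes are ignored (assumed model).
    return zlib.crc32(event["body"])

def batch_checksum(events):
    # Combine per-event checksums with XOR (an assumption for this sketch).
    total = 0
    for event in events:
        total ^= body_checksum(event)
    return total

sent = [{"body": b"event-%d" % i, "attrs": {}} for i in range(4)]
agent_side = batch_checksum(sent)

# Adding an attribute leaves every body, and so the checksum, unchanged.
tagged = [{"body": e["body"], "attrs": {"newattr": "newvalue"}} for e in sent]
attrs_ok = batch_checksum(tagged) == agent_side

# Dropping an event changes the combined checksum, so it no longer matches.
filtered = sent[1:]
filter_ok = batch_checksum(filtered) == agent_side
```

In this model attrs_ok comes out true and filter_ok comes out false, which
is the difference between decorating and filtering described above.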

You can, however, put filtering "after" the collector, or do filtering "next
to" the collector.

Ok because value adds attributes and does not modify the body.
node : <source> |  agentE2ESink("ip of collector");
collector: collectorSource | value("newattr","newvalue")
collectorSink("hdfs://xxxx", ...);

Ok because the filter runs before the checksums are calculated.
node : <source> | filterOutEvents agentE2ESink("ip of collector");
collector: collectorSource | collectorSink("hdfs://xxxx", ...);

Ok because filter is after checksums are validated.
node : <source> | agentE2ESink("ip of collector");
collector: collectorSource | collector(xxx) { filterOutEvents
escapedFormatDfs("hdfs://xxxx", ...) } ;

Not ok: the checksums won't work out, because the filtered-out events were
counted in the agent's checksum but never reach the checksum calculation in
the collectorSink.
node : <source> | agentE2ESink("ip of collector");
collector: collectorSource | filterOutEvents collectorSink("hdfs://xxxx",
...);
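
The four layouts above can be played through with a small Python model.
This is only a sketch under assumptions, not how Flume is implemented:
events are (body, attrs) tuples, the checksum is an XOR of CRC32 values over
bodies, and run, add_attr, and drop_debug are hypothetical stand-ins for the
pipeline, value(...), and filterOutEvents.

```python
import zlib

def checksum(events):
    # XOR of CRC32 over bodies only (assumed model of the ack checksum).
    total = 0
    for body, _attrs in events:
        total ^= zlib.crc32(body)
    return total

def add_attr(events):
    # Stand-in for value("newattr","newvalue"): bodies are untouched.
    return [(body, {**attrs, "newattr": "newvalue"}) for body, attrs in events]

def drop_debug(events):
    # Stand-in for filterOutEvents: drops events whose body starts with DEBUG.
    return [(body, attrs) for body, attrs in events
            if not body.startswith(b"DEBUG")]

def run(before_agent, before_collector, inside_collector, events):
    for stage in before_agent:
        events = stage(events)
    agent_sum = checksum(events)          # agentE2ESink computes checksums
    for stage in before_collector:
        events = stage(events)
    acked = checksum(events) == agent_sum  # collector validates checksums
    for stage in inside_collector:
        events = stage(events)             # runs after validation
    return acked, len(events)

events = [(b"INFO a", {}), (b"DEBUG b", {}), (b"INFO c", {})]

layout1 = run([], [add_attr], [], events)    # attributes added on collector
layout2 = run([drop_debug], [], [], events)  # filter before agentE2ESink
layout3 = run([], [], [drop_debug], events)  # filter inside collector(...) {}
layout4 = run([], [drop_debug], [], events)  # filter before collectorSink
```

Each result is an (acked, events written) pair. In this model the first
three layouts are acked and only the last one is not, matching the four
cases above.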

Does that make sense?

Jon.


On Wed, Aug 17, 2011 at 2:39 AM, Bao Thai Ngo <baothaingo@gmail.com> wrote:

> Hi,
>
> As far as I understand, the ACK mechanism should work regardless of any
> decorator deployed at the Collector, as Mingje said. I developed and deployed
> several plug-ins (decorators) that filter out events on the Collector side,
> and they work well with ACK. Another thing I can suggest: do not try to
> build ACK handling for events into your decorator.
>
> @Felix: Some advantages of deploying a decorator on the collector side are:
> - it does not depend on the agent side
> - we collect the data we need and save the other data for future needs (what
>   we need is just a small part of a very large data set)
>
> Just my 2 cents.
>
> ~Thai
>
>
> On Tue, Aug 16, 2011 at 9:52 PM, Joe Crobak <joecrow@gmail.com> wrote:
>
>> According to the Flume FAQ [1], Flume acks events from the CollectorSink
>> in E2E mode.  If I have a Decorator running on the Collector that filters
>> out events (or transforms them or something), does that mean those events
>> won't get ACK'd and thus delivery will be retried for them
>> indefinitely? IOW, is E2E mode unsupported in this situation -- or maybe is
>> there a way for me to ACK events that I want to filter from the Decorator
>> itself?
>>
>> Thanks,
>> Joe
>>
>>
>> [1] https://github.com/cloudera/flume/wiki/FAQ
>>
>
>


-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
