flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: Delete individual message from queue
Date Tue, 10 Feb 2015 00:26:44 GMT
Correct - that would be pretty tricky. We could indeed modify the tool to take a custom function
to process each event - that would work. We would need to define an interface for the user to
implement, say, FileChannelDataVerifier or something, and then call it on each event.
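
A rough sketch of what such a callback could look like is below. Only the name
FileChannelDataVerifier comes from this thread; the isValid method, the SimpleEvent
stand-in (used here instead of the real Flume event class so the example compiles
on its own) and the keepValid helper are assumptions made for illustration. It does
not read or write file channel data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class VerifierSketch {

    /** Minimal stand-in for a Flume event (headers plus opaque body); not the real class. */
    static final class SimpleEvent {
        final Map<String, String> headers;
        final byte[] body;

        SimpleEvent(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    /** Hypothetical user-implemented callback, named after the suggestion in this thread. */
    interface FileChannelDataVerifier {
        /** Returns true if the event should be kept, false if it should be dropped. */
        boolean isValid(SimpleEvent event);
    }

    /** Calls the verifier on each event and returns only the events it accepts. */
    static List<SimpleEvent> keepValid(List<SimpleEvent> events, FileChannelDataVerifier verifier) {
        List<SimpleEvent> kept = new ArrayList<>();
        for (SimpleEvent event : events) {
            if (verifier.isValid(event)) {
                kept.add(event);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = Arrays.asList(
            new SimpleEvent(Collections.emptyMap(), "ok:1".getBytes(StandardCharsets.UTF_8)),
            new SimpleEvent(Collections.emptyMap(), "garbage".getBytes(StandardCharsets.UTF_8)));

        // Example rule (an assumption): a valid payload starts with "ok:".
        FileChannelDataVerifier verifier =
            e -> new String(e.body, StandardCharsets.UTF_8).startsWith("ok:");

        System.out.println(keepValid(events, verifier).size() + " of " + events.size() + " events kept");
    }
}
```

In an actual tool the verifier would presumably be loaded by class name from a
command-line option and applied to events as they are read back from the channel's
log files; that part is left out here.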

Do you want to take a stab at it?

Thanks, Hari

On Wed, Feb 4, 2015 at 7:27 PM, Ashish <paliwalashish@gmail.com> wrote:

> I think I was not clear. I was talking about an offline tool which
> would help us clean the Channel of events that are valid Flume Events
> but carry an invalid/malformed payload. What you described would work
> in the Agent flow. Given the current scenario, Charles would be keen to
> push the backed up events to the destination.
> Here is what I was talking about:
> offline tool -> read event from file channel -> event not corrupt ->
> decodePayload (user supplied) -> if valid write to file channel else
> drop the event
> As far as my reading goes, everything on the file channel is stored as a
> TransactionEvent, so this might not be as simple as it looks. In
> other words, it would be like a FileChannelSink with a user provided
> function to validate the payload.
> On Thu, Feb 5, 2015 at 1:02 AM, Hari Shreedharan
> <hshreedharan@cloudera.com> wrote:
>> It is not as easy, since the channel does not know what an event looks like
>> or why the transaction is being rolled back. That is something that is being
>> handled by the serializer and sink. We need to somehow remove an event from
>> the channel if it is broken. We might perhaps have to add a new
>> interface that allows the sink/serializer to tell the channel to “forget” a
>> specific event, and that event will be dropped from the channel and
>> transaction.
>>
>> I don’t see how we can do it outside the sink for this reason.
>>
>> Thanks,
>> Hari
>>
>>
>> On Wed, Feb 4, 2015 at 5:32 AM, Ashish <paliwalashish@gmail.com> wrote:
>>>
>>> Is it possible to extend File Channel Integrity tool to support
>>> filtering out corrupt events? Something like once we get a record and
>>> it's not corrupt, provide a placeholder function to validate Event
>>> implemented by user. I still don't have too much insight into
>>> FileChannel implementation.
>>>
>>> On Tue, Feb 3, 2015 at 12:49 AM, Hari Shreedharan
>>> <hshreedharan@cloudera.com> wrote:
>>> > Currently, no - there is no such tool, but this is a request that has come
>>> > up time and again. Can you file a jira for this? If someone has time, they’d
>>> > probably pick it up
>>> >
>>> > Thanks,
>>> > Hari
>>> >
>>> >
>>> > On Mon, Feb 2, 2015 at 10:52 AM, Charles McLaughlin
>>> > <charles@nextdoor.com> wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> We had a situation where one of our Flume agents got stuck on a message
>>> >> due to unexpected format. To get things moving again, I stopped the Flume
>>> >> agent, moved the file backed channel data out of the way and re-started the
>>> >> Flume agent. I'd like to pop the bad message from the queue data on disk...
>>> >> are there any tools or recommended ways to do this?
>>> >>
>>> >> Thanks,
>>> >> Charles
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> thanks
>>> ashish
>>>
>>> Blog: http://www.ashishpaliwal.com/blog
>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>>
> -- 
> thanks
> ashish
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal