flume-user mailing list archives

From Anat Rozenzon <a...@viber.com>
Subject Re: Problem Events
Date Thu, 01 Aug 2013 09:42:54 GMT
The message is already in the channel.
Is there a way to write an interceptor that works after the channel, or before
the sink?

The only thing I found is to stop everything and delete the channel files,
but I won't be able to use this approach in production :-(
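
One way to picture a check "before the sink" is a small guard that inspects an
event's headers before the write is attempted and fills in a missing timestamp.
The sketch below is a stand-alone illustration only: SimpleEvent, TimestampGuard
and ensureTimestamp are hypothetical names, not Flume classes, and the header
key "timestamp" is assumed to be the one the sink looks up. In Flume itself this
logic would have to live in a custom sink or a wrapper around one, since
interceptors run before the channel.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an event (headers plus body); not the Flume Event interface.
class SimpleEvent {
    final Map<String, String> headers = new HashMap<>();
    final byte[] body;

    SimpleEvent(byte[] body) {
        this.body = body;
    }
}

class TimestampGuard {
    // Assumed header key; adjust to whatever the sink actually reads.
    static final String TIMESTAMP = "timestamp";

    // Returns true if the event already carried a parseable timestamp header.
    // Otherwise writes fallbackMillis into the header and returns false.
    static boolean ensureTimestamp(SimpleEvent event, long fallbackMillis) {
        String value = event.headers.get(TIMESTAMP);
        if (value != null) {
            try {
                Long.parseLong(value);
                return true;
            } catch (NumberFormatException e) {
                // Present but unusable: fall through and overwrite it.
            }
        }
        event.headers.put(TIMESTAMP, Long.toString(fallbackMillis));
        return false;
    }
}
```

A caller could also use the boolean result to log or divert the event instead of
silently patching it.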


On Thu, Aug 1, 2013 at 11:13 AM, Ashish <paliwalashish@gmail.com> wrote:

>
>
>
> On Thu, Aug 1, 2013 at 1:29 PM, Anat Rozenzon <anat@viber.com> wrote:
>
>> Hi,
>>
>> I'm having the same problem with HDFS sink.
>>
>> A 'poison' message which doesn't have the timestamp header that the sink
>> expects.
>> This causes an NPE, which ends in the message being returned to the channel,
>> over and over again.
>>
>> Is my only option to re-write the HDFS sink?
>> Isn't there any way to intercept in the sink work?
>>
>
> You can write a custom interceptor and remove/modify the poison message.
>
> Interceptors are called before a message makes its way into the channel.
>
> http://flume.apache.org/FlumeUserGuide.html#flume-interceptors
>
> I wrote a blog about it a while back
> http://www.ashishpaliwal.com/blog/2013/06/flume-cookbook-implementing-custom-interceptors/
>
>
>
>>
>> Thanks
>> Anat
>>
>>
>> On Fri, Jul 26, 2013 at 3:35 AM, Arvind Prabhakar <arvind@apache.org> wrote:
>>
>>> Sounds like a bug in ElasticSearch sink to me. Do you mind filing a Jira
>>> to track this? Sample data to cause this would be even better.
>>>
>>> Regards,
>>> Arvind Prabhakar
>>>
>>>
>>> On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson <jeremykarlson@gmail.com>
>>> wrote:
>>>
>>>> This was using the provided ElasticSearch sink.  The logs were not
>>>> helpful.  I ran it through with the debugger and found the source of the
>>>> problem.
>>>>
>>>> ContentBuilderUtil uses a very "aggressive" method to determine if the
>>>> content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
>>>> My body contained that but wasn't JSON, causing the JSON parser to throw a
>>>> CharConversionException from addComplexField(...) (but not the expected
>>>> JSONException).  We've changed addComplexField(...) to catch different
>>>> types of exceptions and fall back to treating it as a simple field.  We'll
>>>> probably submit a patch for this soon.
>>>>
>>>> I'm reasonably happy with this, but I still think that in the bigger
>>>> picture there should be some sort of mechanism to automatically detect and
>>>> toss / skip / flag problematic events without them plugging up the flow.
>>>>
>>>> -- Jeremy
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <arvind@apache.org> wrote:
>>>>
>>>>> Jeremy, would it be possible for you to show us logs for the part
>>>>> where the sink fails to remove an event from the channel? I am assuming
>>>>> this is a standard sink that Flume provides and not a custom one.
>>>>>
>>>>> The reason I ask is because sinks do not introspect the event, and
>>>>> hence there is no reason why it will fail during the event's removal. It is
>>>>> more likely that there is a problem within the channel in that it cannot
>>>>> dereference the event correctly. Looking at the logs will help us identify
>>>>> the root cause for what you are experiencing.
>>>>>
>>>>> Regards,
>>>>> Arvind Prabhakar
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <
>>>>> jeremykarlson@gmail.com> wrote:
>>>>>
>>>>>> Both reasonable suggestions.  What would a custom sink look like in
>>>>>> this case, and how would I only eliminate the problem events since I don't
>>>>>> know what they are until they are attempted by the "real" sink?
>>>>>>
>>>>>> My philosophical concern (in general) is that we're taking the
>>>>>> approach of exhaustively finding and eliminating possible failure cases.
>>>>>> It's not possible to eliminate every single failure case, so shouldn't
>>>>>> there be a method of last resort to eliminate problem events from the
>>>>>> channel?
>>>>>>
>>>>>> -- Jeremy
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <
>>>>>> hshreedharan@cloudera.com> wrote:
>>>>>>
>>>>>>> Or you could write a custom sink that removes this event (more work
>>>>>>> of course)
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Hari
>>>>>>>
>>>>>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>>>>>
>>>>>>> if you have a way to identify such events.. you may be able to use
>>>>>>> the Regex interceptor to toss them out before they get into the channel.
>>>>>>>
>>>>>>>
>>>>>>>  On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <
>>>>>>> jeremykarlson@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi everyone.  My Flume adventures continue.
>>>>>>>
>>>>>>> I'm in a situation now where I have a channel that's filling because
>>>>>>> a stubborn message is stuck.  The sink won't accept it (for whatever
>>>>>>> reason; I can go into detail but that's not my point here).  This just
>>>>>>> blocks up the channel entirely, because it goes back into the channel when
>>>>>>> the sink refuses.  Obviously, this isn't ideal.
>>>>>>>
>>>>>>> I'm wondering what mechanisms, if any, Flume has to deal with these
>>>>>>> situations.  Things that come to mind might be:
>>>>>>>
>>>>>>> 1. Ditch the event after n attempts.
>>>>>>> 2. After n attempts, send the event to a "problem area" (maybe a
>>>>>>> different source / sink / channel?) that someone can look at later.
>>>>>>> 3. Some sort of mechanism that allows operators to manually kill
>>>>>>> these messages.
>>>>>>>
>>>>>>> I'm open to suggestions on alternatives as well.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> -- Jeremy
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>
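
The first two ideas in Jeremy's original message (ditch the event after n
attempts, or move it to a "problem area" for later inspection) can be sketched
in a few lines. This is only an illustration under stated assumptions:
RetryLimitedDrain and drain are hypothetical names, the "channel" is a plain
in-memory Deque rather than a Flume channel, and nothing here is transactional.
It only shows the control flow that stops one bad event from blocking the
events queued behind it.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

class RetryLimitedDrain {
    // Delivers queued items in order. An item whose delivery fails (returns false
    // or throws) maxAttempts times is moved to the returned "problem area" list
    // instead of being put back at the head of the queue.
    static <T> List<T> drain(Deque<T> channel, Predicate<T> deliver, int maxAttempts) {
        List<T> problemArea = new ArrayList<>();
        while (!channel.isEmpty()) {
            T head = channel.peekFirst();
            boolean delivered = false;
            for (int attempt = 1; attempt <= maxAttempts && !delivered; attempt++) {
                try {
                    delivered = deliver.test(head);
                } catch (RuntimeException e) {
                    // An exception counts as a failed attempt; retry until the limit.
                }
            }
            channel.pollFirst();
            if (!delivered) {
                problemArea.add(head);
            }
        }
        return problemArea;
    }
}
```

Whether the problem area is a log file, a second channel or something else is a
separate design choice; the point of the sketch is only the attempt limit.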
