flume-user mailing list archives

From Jeremy Karlson <jeremykarl...@gmail.com>
Subject Re: Problem Events
Date Thu, 01 Aug 2013 16:26:20 GMT
To my knowledge (which is admittedly limited), there is no way to deal with
these in a way that will make your day.  I'm happy if someone can say
otherwise.

This is very similar to a problem I had a week or two ago.  I fixed it by
restarting Flume with debugging on, connecting to it with the debugger, and
finding the message in the sink.  Discovered a bug in the sink.  Downloaded
Flume, fixed the bug, recompiled, installed a custom version, etc.

I agree that this is not a practical solution, and I still believe that
Flume needs some sort of "sink of last resort" option or something, like
JMS implementations.
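
To make the "sink of last resort" idea concrete, here is a minimal sketch in
plain Java.  It is not Flume code and uses no Flume classes: the class name
LastResortDrainer, the Deque standing in for a channel, and the two Consumer
handlers standing in for the "real" sink and the "problem area" are all
hypothetical names chosen for illustration.  It only shows the control flow:
retry an event a bounded number of times, then divert it instead of letting
it block everything behind it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical illustration only: none of these types are Flume classes.
 * Drains a queue through a primary handler; an event that keeps failing is
 * handed to a fallback after maxAttempts instead of blocking the queue.
 */
final class LastResortDrainer<E> {
    private final Consumer<E> primary;   // stands in for the "real" sink
    private final Consumer<E> fallback;  // stands in for the "problem area"
    private final int maxAttempts;

    LastResortDrainer(Consumer<E> primary, Consumer<E> fallback, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.primary = primary;
        this.fallback = fallback;
        this.maxAttempts = maxAttempts;
    }

    /** Processes every queued event in order; returns how many were diverted. */
    int drain(Deque<E> channel) {
        int diverted = 0;
        while (!channel.isEmpty()) {
            E event = channel.peekFirst();
            boolean delivered = false;
            for (int attempt = 1; attempt <= maxAttempts && !delivered; attempt++) {
                try {
                    primary.accept(event);
                    delivered = true;
                } catch (RuntimeException e) {
                    // Delivery failed; the event is still at the head of the queue.
                }
            }
            if (!delivered) {
                fallback.accept(event); // last resort: park the event elsewhere
                diverted++;
            }
            channel.pollFirst(); // remove only once delivered or diverted
        }
        return diverted;
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        List<String> problems = new ArrayList<>();
        // A picky handler that rejects any body containing "{".
        Consumer<String> pickySink = body -> {
            if (body.contains("{")) {
                throw new IllegalStateException("cannot parse: " + body);
            }
            delivered.add(body);
        };
        Deque<String> channel = new ArrayDeque<>(Arrays.asList("a", "not {json", "b"));
        int diverted = new LastResortDrainer<String>(pickySink, problems::add, 3).drain(channel);
        System.out.println(delivered + " " + problems + " diverted=" + diverted);
    }
}
```

In this sketch the events "a" and "b" reach the primary handler, while the
malformed one ends up in the problems list after three failed attempts.  A
real implementation would also have to decide what happens when the fallback
itself fails, which the sketch does not handle.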

-- Jeremy



On Thu, Aug 1, 2013 at 2:42 AM, Anat Rozenzon <anat@viber.com> wrote:

> The message is already in the channel.
> Is there a way to write an interceptor to work after the channel? or
> before the sink?
>
> The only thing I found is to stop everything and delete the channel files,
> but I won't be able to use this approach in production :-(
>
>
> On Thu, Aug 1, 2013 at 11:13 AM, Ashish <paliwalashish@gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Aug 1, 2013 at 1:29 PM, Anat Rozenzon <anat@viber.com> wrote:
>>
>>> Hi,
>>>
>>> I'm having the same problem with HDFS sink.
>>>
>>> A 'poison' message which doesn't have the timestamp header that the sink
>>> expects.
>>> This causes an NPE, which ends in returning the message to the channel,
>>> over and over again.
>>>
>>> Is my only option to re-write the HDFS sink?
>>> Isn't there any way to intercept the sink's work?
>>>
>>
>> You can write a custom interceptor and remove/modify the poison message.
>>
>> Interceptors are called before a message makes its way into the channel.
>>
>> http://flume.apache.org/FlumeUserGuide.html#flume-interceptors
>>
>> I wrote a blog about it a while back
>> http://www.ashishpaliwal.com/blog/2013/06/flume-cookbook-implementing-custom-interceptors/
>>
>>
>>
>>>
>>> Thanks
>>> Anat
>>>
>>>
>>> On Fri, Jul 26, 2013 at 3:35 AM, Arvind Prabhakar <arvind@apache.org> wrote:
>>>
>>>> Sounds like a bug in ElasticSearch sink to me. Do you mind filing a
>>>> Jira to track this? Sample data to cause this would be even better.
>>>>
>>>> Regards,
>>>> Arvind Prabhakar
>>>>
>>>>
>>>> On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson <
>>>> jeremykarlson@gmail.com> wrote:
>>>>
>>>>> This was using the provided ElasticSearch sink.  The logs were not
>>>>> helpful.  I ran it through with the debugger and found the source of the
>>>>> problem.
>>>>>
>>>>> ContentBuilderUtil uses a very "aggressive" method to determine if the
>>>>> content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
>>>>> My body contained that but wasn't JSON, causing the JSON parser to throw a
>>>>> CharConversionException from addComplexField(...) (but not the expected
>>>>> JSONException).  We've changed addComplexField(...) to catch different
>>>>> types of exceptions and fall back to treating it as a simple field.  We'll
>>>>> probably submit a patch for this soon.
>>>>>
>>>>> I'm reasonably happy with this, but I still think that in the bigger
>>>>> picture there should be some sort of mechanism to automatically detect and
>>>>> toss / skip / flag problematic events without them plugging up the flow.
>>>>>
>>>>> -- Jeremy
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <arvind@apache.org> wrote:
>>>>>
>>>>>> Jeremy, would it be possible for you to show us logs for the part
>>>>>> where the sink fails to remove an event from the channel? I am assuming
>>>>>> this is a standard sink that Flume provides and not a custom one.
>>>>>>
>>>>>> The reason I ask is because sinks do not introspect the event, and
>>>>>> hence there is no reason why it will fail during the event's removal. It is
>>>>>> more likely that there is a problem within the channel in that it cannot
>>>>>> dereference the event correctly. Looking at the logs will help us identify
>>>>>> the root cause for what you are experiencing.
>>>>>>
>>>>>> Regards,
>>>>>> Arvind Prabhakar
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <
>>>>>> jeremykarlson@gmail.com> wrote:
>>>>>>
>>>>>>> Both reasonable suggestions.  What would a custom sink look like in
>>>>>>> this case, and how would I only eliminate the problem events since I don't
>>>>>>> know what they are until they are attempted by the "real" sink?
>>>>>>>
>>>>>>> My philosophical concern (in general) is that we're taking the
>>>>>>> approach of exhaustively finding and eliminating possible failure cases.
>>>>>>> It's not possible to eliminate every single failure case, so shouldn't
>>>>>>> there be a method of last resort to eliminate problem events from the
>>>>>>> channel?
>>>>>>>
>>>>>>> -- Jeremy
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <
>>>>>>> hshreedharan@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Or you could write a custom sink that removes this event (more work
>>>>>>>> of course)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Hari
>>>>>>>>
>>>>>>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>>>>>>
>>>>>>>> if you have a way to identify such events.. you may be able to use
>>>>>>>> the Regex interceptor to toss them out before they get into the channel.
>>>>>>>>
>>>>>>>>
>>>>>>>>  On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <
>>>>>>>> jeremykarlson@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi everyone.  My Flume adventures continue.
>>>>>>>>
>>>>>>>> I'm in a situation now where I have a channel that's filling
>>>>>>>> because a stubborn message is stuck.  The sink won't accept it (for
>>>>>>>> whatever reason; I can go into detail but that's not my point here).  This
>>>>>>>> just blocks up the channel entirely, because it goes back into the channel
>>>>>>>> when the sink refuses.  Obviously, this isn't ideal.
>>>>>>>>
>>>>>>>> I'm wondering what mechanisms, if any, Flume has to deal with these
>>>>>>>> situations.  Things that come to mind might be:
>>>>>>>>
>>>>>>>> 1. Ditch the event after n attempts.
>>>>>>>> 2. After n attempts, send the event to a "problem area" (maybe a
>>>>>>>> different source / sink / channel?)  that someone can look at later.
>>>>>>>> 3. Some sort of mechanism that allows operators to manually kill
>>>>>>>> these messages.
>>>>>>>>
>>>>>>>> I'm open to suggestions on alternatives as well.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> -- Jeremy
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>
>
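
For the HDFS sink case reported earlier in this thread (an event arriving
without the timestamp header the sink expects), one preventive option is
Flume's built-in timestamp interceptor on the source.  A sketch, assuming an
agent named a1 with a source named r1; both names and the interceptor label
ts are placeholders, not taken from the thread:

```properties
# Placeholder agent/source names (a1, r1); adjust to the real configuration.
a1.sources.r1.interceptors = ts
a1.sources.r1.interceptors.ts.type = timestamp
# Keep a timestamp header that the event already carries.
a1.sources.r1.interceptors.ts.preserveExisting = true
```

As noted in the thread, interceptors run before an event enters the channel,
so this only protects against future events; it does nothing for an event
that is already stuck in the channel.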
