flume-user mailing list archives

From Sverre Bakke <sverre.ba...@gmail.com>
Subject Re: How to handle ChannelFullException
Date Thu, 29 Jan 2015 19:50:12 GMT
Hi,

I currently have only a single sink. I actually did not know I could
have several sinks attached to the same channel. I will try this, as
well as increasing the channel size, and see if I get rid of these
ChannelFullExceptions.
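
For reference, a minimal sketch of what that change might look like in the
agent configuration, assuming an agent named a1, a channel c1, and two Kafka
sinks k1 and k2 (the KafkaSink class name and its properties are assumptions
that depend on the Kafka sink implementation and Flume version in use):

    # Enlarge the memory channel (defaults: capacity = 100, transactionCapacity = 100)
    a1.channels = c1
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 100000
    a1.channels.c1.transactionCapacity = 1000

    # Two sinks draining the same channel; each sink runs in its own thread,
    # so the channel drains roughly twice as fast if Kafka can keep up
    a1.sinks = k1 k2
    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k1.channel = c1
    a1.sinks.k2.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k2.channel = c1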

On Thu, Jan 29, 2015 at 8:45 PM, Hari Shreedharan
<hshreedharan@cloudera.com> wrote:
> How many sinks do you have? Adding more sinks increases parallelism and will
> clear the channel faster, provided the downstream system can handle the
> load.
>
> Thanks,
> Hari
>
>
> On Thu, Jan 29, 2015 at 9:41 AM, Sverre Bakke <sverre.bakke@gmail.com>
> wrote:
>>
>> Hi,
>>
>> Thanks for your feedback. I can of course switch to the multiport one
>> if the plain one is not maintained.
>>
>> Back to the ChannelFullException issue. I can increase the channel
>> size, but the basic problem remains: as long as the syslog client is
>> faster than the Flume sink, this exception will occur and data will
>> be lost. I really believe that blocking, so that the syslog client
>> must wait before sending more data, is the way to go for a robust
>> solution.
>>
>> Let's assume that the syslog client reads batches of events, e.g. from
>> a file, and sends them as fast as possible to the Flume multiport TCP
>> syslog source. In such cases the average events-per-second rate would
>> be moderate, while in practice there would be huge spikes where the
>> client delivers as fast as possible. Instead of asking the client to
>> "slow down", Flume would accept the events and then drop them. This
>> forces me as an admin to monitor the logs and try to guess which
>> events were dropped. If this happens, I can have a reliable and
>> persistent channel configured, but events will still be dropped, thus
>> undermining the entire solution.
>>
>>
>>
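
A minimal sketch of the "reliable and persistent channel" mentioned above,
assuming an agent named a1 and illustrative directory paths; note that the
file channel only protects events that actually make it into the channel, so
a full channel still means drops at the source:

    a1.channels = c1
    a1.channels.c1.type = file
    # Illustrative paths; events are persisted here across agent restarts
    a1.channels.c1.checkpointDir = /var/lib/flume/checkpoint
    a1.channels.c1.dataDirs = /var/lib/flume/data
    a1.channels.c1.capacity = 1000000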
>> On Thu, Jan 29, 2015 at 4:56 PM, Jeff Lord <jlord@cloudera.com> wrote:
>> > Have you considered increasing the size of the memory channel? I haven't
>> > played with the Kafka sink much, but with HDFS we often add sinks,
>> > which
>> > can help increase the flow through the channel.
>> > The multiport syslog source is the way to go here, as it will give
>> > better
>> > performance. We should probably go ahead and deprecate the vanilla
>> > syslog
>> > source.
>> >
>> >
>> > On Thursday, January 29, 2015, Sverre Bakke <sverre.bakke@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I have a syslogtcp source using a default memory channel and a Kafka
>> >> sink. When producing data as fast as possible (3000 syslog events in
>> >> a second), the source seems to accept all the data, but crashes with
>> >> a ChannelFullException when adding events to the channel.
>> >>
>> >> Is there any way to throttle, or otherwise wait to receive more
>> >> syslog events until the channel has room again, rather than crashing
>> >> because the channel is full? I would prefer that Flume accept syslog
>> >> events more slowly rather than crash and drop events.
>> >>
>> >> 29 Jan 2015 16:26:56,721 ERROR [New I/O worker #2]
>> >> (org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived:94)
>> >> - Error writting to channel, event dropped
>> >>
>> >> Also, the syslogtcp source seems to keep the syslog headers
>> >> regardless of the keepFields setting; is there any common reason why
>> >> this might happen? In contrast, the multiport syslog TCP listener
>> >> works as expected with this setting.
>
>
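
For completeness, a minimal sketch of the multiport syslog TCP source with
keepFields disabled, assuming an agent named a1, a channel c1, and
illustrative ports (the accepted keepFields values vary between Flume
versions; newer releases use values such as "none" or "all" instead of
booleans):

    a1.sources = r1
    a1.sources.r1.type = multiport_syslogtcp
    a1.sources.r1.host = 0.0.0.0
    a1.sources.r1.ports = 5140 5141
    # Strip the syslog priority/timestamp/hostname prefix from the event body
    a1.sources.r1.keepFields = false
    a1.sources.r1.channels = c1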
