flume-user mailing list archives

From Arvind Prabhakar <arv...@apache.org>
Subject Re: Failover Sink Processor Configuration
Date Wed, 23 May 2012 22:31:59 GMT
Hi Tejinder,

The version you are using is a bit dated, considering it was part of the CDH
beta release. I suggest you either wait for the stable release of CDH or use
the trunk version in the meantime.

For what it's worth, we are rigorously testing the trunk and fixing the bugs
we find there. We would love to make your production deployment successful and
to help resolve any issues you may run into.

Thanks,
Arvind Prabhakar

On Wed, May 23, 2012 at 3:01 PM, Tejinder Aulakh <tejinder@sharethis.com> wrote:

> A gentle reminder. We are thinking about using Flume 1.1.0 in production
> soon, so we need a failover solution.
>
> Tejinder
>
> On Tue, May 22, 2012 at 12:15 PM, Tejinder Aulakh <tejinder@sharethis.com> wrote:
>
>> Hi Arvind,
>>
>> The Flume version we are using is 1.1.0, and it looks like the maxpenalty
>> option is not present in this version. I'm looking at the source code and
>> cannot find maxpenalty. Was this recently added?
>>
>> Flume Version - flume-ng-1.1.0-1.cdh4.0.0b2.p0.27.el6.noarch
>>
>> So, how does failover behave in 1.1.0?
>>
>> TJ
>> On Mon, May 21, 2012 at 5:41 PM, Arvind Prabhakar <arvind@apache.org> wrote:
>>
>>> Hi,
>>>
>>> On Mon, May 21, 2012 at 5:06 PM, Tejinder Aulakh <tejinder@sharethis.com
>>> > wrote:
>>>
>>>> 1) If sink2 also goes down, does it try to reconnect back to sink1
>>>> before giving up?
>>>>
>>>
>>> In short, yes. The long answer: it uses a progressive time window to
>>> penalize failing sinks, up to a maximum specified ceiling. For example, the
>>> first time a sink fails it will be blacklisted for one second, the second
>>> time for two seconds, and so on until the max penalty time is reached. If
>>> one or more failed sinks have served out their penalty period, they will be
>>> tried ahead of other sinks that may be active.
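The progressive penalty Arvind describes can be sketched in a few lines of Python. This is only an illustration, not Flume's actual Java FailoverSinkProcessor: the class and method names are mine, and the doubling schedule (1s, 2s, 4s, ... capped at a ceiling) is an assumption based on the "one second, two seconds" progression described above.

```python
import time

class FailoverSelector:
    """Sketch of a progressive-penalty sink selector (illustrative names,
    not Flume's actual implementation)."""

    def __init__(self, sinks, max_penalty=10.0):
        self.sinks = list(sinks)                   # ordered by priority, primary first
        self.max_penalty = max_penalty             # penalty ceiling in seconds
        self.fail_count = {s: 0 for s in sinks}    # consecutive failures per sink
        self.retry_at = {s: 0.0 for s in sinks}    # earliest time each sink may be retried

    def record_failure(self, sink, now=None):
        now = time.monotonic() if now is None else now
        self.fail_count[sink] += 1
        # Progressive penalty: 1s, 2s, 4s, ... capped at max_penalty.
        penalty = min(2.0 ** (self.fail_count[sink] - 1), self.max_penalty)
        self.retry_at[sink] = now + penalty

    def record_success(self, sink):
        self.fail_count[sink] = 0
        self.retry_at[sink] = 0.0

    def next_sink(self, now=None):
        now = time.monotonic() if now is None else now
        # The highest-priority sink whose penalty window has expired wins,
        # so the primary is retried automatically once its penalty lapses.
        for s in self.sinks:
            if self.retry_at[s] <= now:
                return s
        # Every sink is penalized: try the one whose window expires soonest.
        return min(self.sinks, key=lambda s: self.retry_at[s])
```

Note how this also captures the automatic switchback discussed below: once the primary's penalty window expires, it is preferred over the backup again without any restart.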
>>>
>>>
>>>
>>>>
>>>> 2) Is there any way to have Flume switch back to sink1 from sink2 as
>>>> soon as it comes back online? If not, how would you switch back to the
>>>> primary sink (sink1)? Restart? That would result in losing events sitting
>>>> in the memory channel, correct?
>>>>
>>>
>>> The switchback is automatic, although not instantaneous due to the
>>> progressive time window mechanism described above. Once the sink becomes
>>> available again, the switchback is transparent; there is no need to do any
>>> restarts.
>>>
>>> Arvind
>>>
>>>
>>>>
>>>> Tejinder
>>>>
>>>>
>>>> On Fri, May 18, 2012 at 5:11 PM, Tejinder Aulakh <
>>>> tejinder@sharethis.com> wrote:
>>>>
>>>>> Thanks Arvind for your quick response. Will file a request
>>>>> for documentation.
>>>>>
>>>>> Tejinder
>>>>>
>>>>>
>>>>> On Fri, May 18, 2012 at 4:15 PM, Arvind Prabhakar <arvind@apache.org> wrote:
>>>>>
>>>>>> Hi Tejinder,
>>>>>> On Fri, May 18, 2012 at 4:01 PM, Tejinder Aulakh <
>>>>>> tejinder@sharethis.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> 1)  I'm using Flume-NG and we have a large number of agents (log
>>>>>>> servers) sending log events to a collector server.
>>>>>>>
>>>>>>> How do I configure the agents so that in case this single collector
>>>>>>> goes down, agents start sending events to the backup collector? I
>>>>>>> believe this will be the Failover Sink processor. Is that correct? I'm
>>>>>>> not able to find any documentation on this configuration. Can someone
>>>>>>> please provide the sample configuration for failover sink?
>>>>>>>
>>>>>>
>>>>>> From the Javadocs of the Failover Sink Processor:
>>>>>>
>>>>>>  * host1.sinkgroups = group1
>>>>>>  *
>>>>>>  * host1.sinkgroups.group1.sinks = sink1 sink2
>>>>>>  * host1.sinkgroups.group1.processor.type = failover
>>>>>>  * host1.sinkgroups.group1.processor.priority.sink1 = 5
>>>>>>  * host1.sinkgroups.group1.processor.priority.sink2 = 10
>>>>>>  * host1.sinkgroups.group1.processor.maxpenalty = 10000
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Flume NG User Guide is also missing this info.
>>>>>>> http://dl.dropbox.com/u/27523578/Flume/FlumeUserGuide.pdf
>>>>>>>
>>>>>>
>>>>>> Please help the project by reporting such things in the Jira.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> 2) What is the best way to determine the memory channel capacity? Is
>>>>>>> there any way to monitor how many events are sitting in the channel at
>>>>>>> any given time?
>>>>>>>
>>>>>>
>>>>>> Capacity c, Max Event Size n, Heap Size h, Extra overhead d should
>>>>>> all relate as:
>>>>>>
>>>>>>    c*n + d <= h
>>>>>>
>>>>>> For example, if your maximum event size is 2KB and your heap is 20MB,
>>>>>> and you would like to reserve half of it for memory overhead, the
>>>>>> remaining 10MB will yield a capacity of 5000.
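Arvind's sizing rule can be checked with a few lines of Python. The function name is mine, and the decimal units (2 KB = 2,000 bytes, 20 MB = 20,000,000 bytes) mirror the arithmetic in the email's example:

```python
# Memory-channel capacity from the relation c*n + d <= h,
# solved for c: c <= (h - d) / n.
def channel_capacity(heap_bytes, max_event_bytes, overhead_bytes):
    """Largest event count that fits in the heap after reserving overhead."""
    return (heap_bytes - overhead_bytes) // max_event_bytes

# The thread's example: 20 MB heap, 2 KB max event size,
# half the heap reserved for overhead.
heap = 20_000_000
event = 2_000
overhead = heap // 2
print(channel_capacity(heap, event, overhead))  # 5000
```

Note that the `capacity = 1000000` in the config below would, by the same rule, need roughly 2 GB of usable heap at a 2 KB maximum event size.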
>>>>>>
>>>>>> Also, we are working on monitoring support, which should be available
>>>>>> soon.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> flume.conf
>>>>>>> =======
>>>>>>> agent.channels = myMemoryChannel
>>>>>>> agent.sources = myExecSource
>>>>>>> agent.sinks = myCustomAvroSink
>>>>>>>
>>>>>>> # Define a memory channel called myMemoryChannel
>>>>>>> agent.channels.myMemoryChannel.type = memory
>>>>>>> agent.channels.myMemoryChannel.capacity = 1000000
>>>>>>> agent.channels.myMemoryChannel.transactionCapacity = 10000
>>>>>>> agent.channels.myMemoryChannel.keep-alive = 30
>>>>>>>
>>>>>>> # Define an exec source called myExecSource to tail the log file
>>>>>>> agent.sources.myExecSource.channels = myMemoryChannel
>>>>>>> agent.sources.myExecSource.type = exec
>>>>>>> agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log
>>>>>>>
>>>>>>> # Define a custom avro sink called myCustomAvroSink
>>>>>>> agent.sinks.myCustomAvroSink.channel = myMemoryChannel
>>>>>>> agent.sinks.myCustomAvroSink.type = avro
>>>>>>> agent.sinks.myCustomAvroSink.hostname = {CollectorIP}.amazonaws.com
>>>>>>> agent.sinks.myCustomAvroSink.port = 45678
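To wire the failover processor from the Javadoc snippet earlier in the thread into this agent, the posted configuration might be extended roughly as follows. The backup sink name, group name, and backup hostname placeholder are illustrative; in the failover processor the sink with the higher priority number is preferred, and maxpenalty is in milliseconds:

```properties
# Declare both sinks (backup sink name and hostname are illustrative)
agent.sinks = myCustomAvroSink myBackupAvroSink

agent.sinks.myBackupAvroSink.channel = myMemoryChannel
agent.sinks.myBackupAvroSink.type = avro
agent.sinks.myBackupAvroSink.hostname = {BackupCollectorIP}.amazonaws.com
agent.sinks.myBackupAvroSink.port = 45678

# Group the sinks under a failover processor; the higher-priority sink is primary
agent.sinkgroups = collectorGroup
agent.sinkgroups.collectorGroup.sinks = myCustomAvroSink myBackupAvroSink
agent.sinkgroups.collectorGroup.processor.type = failover
agent.sinkgroups.collectorGroup.processor.priority.myCustomAvroSink = 10
agent.sinkgroups.collectorGroup.processor.priority.myBackupAvroSink = 5
agent.sinkgroups.collectorGroup.processor.maxpenalty = 10000
```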
>>>>>>>
>>>>>>> TJ
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>> TJ Aulakh
>>>>> Senior Software Engineer, ShareThis
>>>>> tejinder@sharethis.com
>>>>> Cell: (510)708-2499
>>>>>
>>>>> 4009 Miranda Avenue, Suite 200,
>>>>> Palo Alto CA 94304
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks,
>>>> TJ Aulakh
>>>> Senior Software Engineer, ShareThis
>>>> tejinder@sharethis.com
>>>> Cell: (510)708-2499
>>>>
>>>> 4009 Miranda Avenue, Suite 200,
>>>> Palo Alto CA 94304
>>>>
>>>>
>>>
>>
>>
>> --
>>
>> Thanks,
>> TJ Aulakh
>> Senior Software Engineer, ShareThis
>> tejinder@sharethis.com
>> Cell: (510)708-2499
>>
>> 4009 Miranda Avenue, Suite 200,
>> Palo Alto CA 94304
>>
>>
>
>
> --
>
> Thanks,
> TJ Aulakh
> Senior Software Engineer, ShareThis
> tejinder@sharethis.com
> Cell: (510)708-2499
>
> 4009 Miranda Avenue, Suite 200,
> Palo Alto CA 94304
>
>
