flume-user mailing list archives

From yogendra reddy <yogendra...@gmail.com>
Subject Re: Flume Topology
Date Tue, 01 Dec 2015 10:15:19 GMT
Thanks for the clarification.

On Fri, Nov 27, 2015 at 2:15 PM, Gonzalo Herreros <gherreros@gmail.com>
wrote:

> Yes, the best way to consolidate multiple sources is to use an avro sink
> on each agent that forwards to the agent that writes to hdfs (that agent
> exposes an avro source to listen to the other agents' avro sinks).
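>
> Something along these lines, as a minimal sketch (hostnames, ports, and
> the tail-based source are assumptions for illustration, not from this
> thread):
>
> # leaf agent on each node: local source -> avro sink to the collector
> node.sources = svcLogs
> node.channels = memCh
> node.sinks = toCollector
> node.sources.svcLogs.type = exec
> node.sources.svcLogs.command = tail -F /var/log/service.log
> node.sources.svcLogs.channels = memCh
> node.channels.memCh.type = memory
> node.sinks.toCollector.type = avro
> node.sinks.toCollector.hostname = collector.example.com
> node.sinks.toCollector.port = 4545
> node.sinks.toCollector.channel = memCh
>
> # collector agent: avro source listening for the leaf agents' avro sinks
> collector.sources = fromNodes
> collector.sources.fromNodes.type = avro
> collector.sources.fromNodes.bind = 0.0.0.0
> collector.sources.fromNodes.port = 4545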
>
>
> On 27 November 2015 at 08:28, zaenal rifai <togatta.fudo@gmail.com> wrote:
>
>> sorry, i mean avro sink
>>
>> On 27 November 2015 at 14:52, Gonzalo Herreros <gherreros@gmail.com>
>> wrote:
>>
>>> Hi Zaenal,
>>>
>>> There is no "avro channel"; Flume will, by default, write avro to any of
>>> the channels.
>>> The point is that a memory channel, or even a file channel, will fill up
>>> very quickly because a single sink cannot keep up with the many sources.
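>>>
>>> For example, a memory channel's buffer is bounded, so once the single
>>> hdfs sink lags behind, puts from the sources start failing (the capacity
>>> numbers here are only illustrative):
>>>
>>> agent.channels.memCh.type = memory
>>> # maximum events held; once full, sources block and then fail their puts
>>> agent.channels.memCh.capacity = 10000
>>> agent.channels.memCh.transactionCapacity = 1000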
>>>
>>> Regards,
>>> Gonzalo
>>>
>>> On 27 November 2015 at 03:43, zaenal rifai <togatta.fudo@gmail.com>
>>> wrote:
>>>
>>>> why not use an avro channel, Gonzalo?
>>>>
>>>> On 26 November 2015 at 20:12, Gonzalo Herreros <gherreros@gmail.com>
>>>> wrote:
>>>>
>>>>> You cannot have multiple processes writing concurrently to the same
>>>>> hdfs file.
>>>>> What you can do is have a topology where many agents forward to an
>>>>> agent that writes to hdfs, but you need a channel that allows the single
>>>>> hdfs writer to lag behind without slowing the sources.
>>>>> A kafka channel might be a good choice.
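>>>>>
>>>>> A sketch of the hdfs-writing agent with a kafka channel (the broker and
>>>>> zookeeper addresses are assumptions; property names follow Flume 1.6):
>>>>>
>>>>> collector.channels = kafkaCh
>>>>> collector.channels.kafkaCh.type = org.apache.flume.channel.kafka.KafkaChannel
>>>>> collector.channels.kafkaCh.brokerList = kafka01:9092
>>>>> collector.channels.kafkaCh.zookeeperConnect = zk01:2181
>>>>> collector.channels.kafkaCh.topic = flume-channel
>>>>> # kafka buffers the backlog, so the single hdfs sink can drain the
>>>>> # topic at its own pace without the sources blocking
>>>>> collector.sinks = hdfsOut
>>>>> collector.sinks.hdfsOut.type = hdfs
>>>>> collector.sinks.hdfsOut.channel = kafkaCh
>>>>> collector.sinks.hdfsOut.hdfs.path = /flume/events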
>>>>>
>>>>> Regards,
>>>>> Gonzalo
>>>>>
>>>>> On 26 November 2015 at 11:57, yogendra reddy <yogendra.60@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Here's my current flume setup for a hadoop cluster to collect service
>>>>>> logs
>>>>>>
>>>>>> - Run a flume agent on each of the nodes
>>>>>> - Configure the flume sink to write to hdfs, so the files end up like
>>>>>> this (a rough sketch of that config follows the list)
>>>>>>
>>>>>> ..flume/events/node0logfile
>>>>>> ..flume/events/node1logfile
>>>>>>
>>>>>> ..flume/events/nodeNlogfile
>>>>>>
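>>>>>> The per-node sink config looks roughly like this (names and the path
>>>>>> are illustrative, not my exact config):
>>>>>>
>>>>>> agent.sinks = hdfsOut
>>>>>> agent.sinks.hdfsOut.type = hdfs
>>>>>> agent.sinks.hdfsOut.hdfs.path = /flume/events
>>>>>> # each agent rolls its own files, hence one file per node
>>>>>> agent.sinks.hdfsOut.hdfs.filePrefix = node0logfile
>>>>>> agent.sinks.hdfsOut.channel = memCh
>>>>>>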
>>>>>> But I want to be able to write all the logs from multiple agents to a
>>>>>> single file in hdfs. How can I achieve this, and what would the
>>>>>> topology look like?
>>>>>> Can this be done via a collector? If yes, where should I run the
>>>>>> collector, and how will this scale for a 1000+ node cluster?
>>>>>>
>>>>>> Thanks,
>>>>>> Yogendra
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
