flume-user mailing list archives

From "Keane, Mike" <mke...@conversantmedia.com>
Subject RE: Spooldir needs a Kafka topic defined in the agent.conf
Date Thu, 07 Jan 2016 14:03:54 GMT
I assume you want to set the topic header based on the contents of the line of data read in from
the Spooling Directory Source?  If so, I think you want to configure a Regex Extractor Interceptor,
or implement your own interceptor to do this.

http://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor
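
As a rough sketch of that suggestion, the fragment below attaches a Regex Extractor Interceptor to a spooldir source so that each event gets a "topic" header. The agent name follows the thread; the source name (spool-source-1), the interceptor name (topic-extractor), the spoolDir path, and above all the regex are assumptions for illustration. It supposes each line of data starts with the topic name followed by a comma, which may not match the real data, and it may not be suitable for binary Avro event bodies. Channel and sink wiring is omitted.

```
# Hypothetical source and interceptor names; adjust to the real agent.conf.
spoolingAgent.sources = spool-source-1
spoolingAgent.sources.spool-source-1.type = spooldir
# Assumed path, not taken from the thread.
spoolingAgent.sources.spool-source-1.spoolDir = /var/spool/flume

# Attach a regex_extractor interceptor to the source.
spoolingAgent.sources.spool-source-1.interceptors = topic-extractor
spoolingAgent.sources.spool-source-1.interceptors.topic-extractor.type = regex_extractor
# Assumed line layout: "<topic>,<rest of line>". Group 1 captures the topic name.
spoolingAgent.sources.spool-source-1.interceptors.topic-extractor.regex = ^([A-Za-z0-9._-]+),
# Map capture group 1 to an event header named "topic".
spoolingAgent.sources.spool-source-1.interceptors.topic-extractor.serializers = s1
spoolingAgent.sources.spool-source-1.interceptors.topic-extractor.serializers.s1.name = topic
```

According to the Flume User Guide, the Kafka sink sends an event to the topic named in its "topic" header when that header is present, so a topic set in agent.conf would then only act as the fallback for events the regex does not match.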


________________________________________
From: Simone Roselli [simone.roselli@plista.com]
Sent: Thursday, January 07, 2016 6:52 AM
To: user
Subject: Re: Spooldir needs a Kafka topic defined in the agent.conf

Hi,

In your configuration, you define the topic name in the agent.conf
(spoolingAgent.sinks.kafka-sink-1.topic = data_in).

This is what I do not want.

I would like the spoolDir to retrieve the topic name from the event headers.


Simone Roselli
ITE Sysadmin
simone.roselli@plista.com
http://www.plista.com

----- Original Message -----
From: "Keane, Mike" <mkeane@conversantmedia.com>
To: "user" <user@flume.apache.org>
Sent: Wednesday, January 6, 2016 6:30:41 PM
Subject: RE: Spooldir needs a Kafka topic defined in the agent.conf

I attempted to put together a little Flume+Kafka tutorial, including using Camus to run MapReduce
jobs that pull from Kafka and write to HDFS.  My example uses a spoolDirSource, KafkaChannel,
and KafkaSink.  This may be of some help to you.

https://github.com/mbkeane/BigDataTechCon/blob/master/README.md



________________________________________
From: Simone Roselli [simone.roselli@plista.com]
Sent: Wednesday, January 06, 2016 10:33 AM
To: user@flume.apache.org
Subject: Spooldir needs a Kafka topic defined in the agent.conf

Hi,

I'm having trouble configuring a spooldir source with the Kafka sink.

In Flume-NG I can use the Kafka sink without specifying a topic name in the agent.conf, since
the event contains this topic name in the headers.

Things look different using the spooldir source. If you don't provide a topic name in agent.conf,
it will only try a default one (default-flume-topic).

Is there a way to force the spooldir source to use the topic name in the headers?

PS: I'm using Spooldir with the AVRO deserializer; no other particular configuration. "fileHeader"
is set to "true".


Many thanks


Simone Roselli
ITE Sysadmin
simone.roselli@plista.com
http://www.plista.com




This email and any files included with it may contain privileged,
proprietary and/or confidential information that is for the sole use
of the intended recipient(s).  Any disclosure, copying, distribution,
posting, or use of the information contained in or attached to this
email is prohibited unless permitted by the sender.  If you have
received this email in error, please immediately notify the sender
via return email, telephone, or fax and destroy this original transmission
and its included files without reading or saving it in any manner.
Thank you.





