flume-user mailing list archives

From christopher palm <cpa...@gmail.com>
Subject Re: Static Interceptors Not Working
Date Wed, 17 Sep 2014 12:20:19 GMT
Ashish,
Thanks for responding.
What I was doing to verify was looking at the file once it was in HDFS.
The goal is that several Flume clients each need to append host and static
interceptor information to their event headers, so that when they stream to
the same HDFS destination I can tell where each line entry came from.
The reason I want to combine several log locations is largely to pull
together one larger file for Hadoop to process rather than many smaller
files.

Does anyone know if there is a way to carry the static/host interceptor
headers across an Avro stream into an HDFS sink?
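
For example, what I am hoping to end up with on the collector side is
something like this (untested sketch; hostname is just the header my
client-side host interceptor sets):

# Collector side: route files by the hostname header set at the client,
# and write the remaining headers into each line of the file body.
collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = /flume/%{hostname}/%y%m%d
collector.sinks.k1.serializer = HEADER_AND_TEXT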



On Tue, Sep 16, 2014 at 11:51 PM, Ashish <paliwalashish@gmail.com> wrote:

> How are you verifying the data?
>
> When you use an Avro sink, the data is sent on to an Avro source; the
> serializer defined in the config (header and text) is not used there.
>
> client.sinks.k1.sink.serializer = HEADER_AND_TEXT has no effect
>
> Not sure about HDFS sink.
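>
> One way to at least confirm the headers survive the Avro hop would be to
> temporarily point the collector channel at a logger sink, which prints each
> event's headers and body to the Flume log (debugging sketch only):
>
> # swap in for the HDFS sink while debugging
> collector.sinks.k1.type = logger
> collector.sinks.k1.channel = ch1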
>
> Let's see if someone else can help here. Anyone?
>
> On Wed, Sep 17, 2014 at 9:29 AM, chris <cpalm3@gmail.com> wrote:
>
>>  Thanks, that worked for the config I have below.
>> Now that I changed this to stream to an Avro sink, it stopped working.
>> I even tried adding it to the HDFS conf, but it doesn't produce the static
>> interceptor headers.
>>
>> Any ideas on why it works with a file_roll sink but not an Avro sink?
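>>
>> One difference I see in the user guide is that file_roll takes its
>> serializer as sink.serializer while the HDFS sink takes a plain serializer
>> property, so maybe my hdfs.serializer line below is just being ignored?
>> Next I would try (untested):
>>
>> collector.sinks.k1.serializer = HEADER_AND_TEXT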
>>
>>
>> Client.config
>> client.channels=ch1
>> client.channels.ch1.type=memory
>> client.channels.ch1.capacity=100000
>> client.channels.ch1.transactionCapacity=1000000000000
>>
>> client.sources=src-1
>> client.sources.src-1.type=spooldir
>> client.sources.src-1.spoolDir=/root/unpack
>> client.sources.src-1.deserializer.maxLineLength=10000
>> client.sources.src-1.interceptors = i2 i1
>> client.sources.src-1.interceptors.i1.type = host
>> client.sources.src-1.interceptors.i1.hostHeader = hostname
>> #client.sources.src-1.interceptors.i1.useIP = true
>> client.sources.src-1.interceptors.i2.type = static
>> client.sources.src-1.interceptors.i2.key = environment
>> client.sources.src-1.interceptors.i2.value = sqa
>> client.sinks=k1
>> client.sinks.k1.type=avro
>> client.sinks.k1.hostname=localhost
>> client.sinks.k1.port=42424
>> client.sinks.k1.sink.serializer = HEADER_AND_TEXT
>> ## Debugging Sink, Comment out AvroSink if you use this one
>> # http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
>> #client.sinks.k1.type = file_roll
>> #client.sinks.k1.sink.directory = /root/sink
>> #client.sinks.k1.sink.rollInterval = 0
>> #client.sinks.k1.sink.serializer = HEADER_AND_TEXT
>>
>> # Connect source and sink with channel
>> client.sources.src-1.channels=ch1
>> client.sinks.k1.channel=ch1
>>
>> HDFS  conf
>> collector.sources=av1
>> collector.sources.av1.interceptors = i2
>> collector.sources.av1.interceptors.i2.type = timestamp
>> collector.sources.av1.type=avro
>> collector.sources.av1.bind=0.0.0.0
>> collector.sources.av1.port=42424
>> collector.sources.av1.channels=ch1
>> collector.channels=ch1
>> collector.channels.ch1.type=memory
>> collector.channels.ch1.capacity = 100000
>> collector.channels.ch1.transactionCapacity = 1000000000000
>> collector.sinks=k1
>> collector.sinks.k1.type=hdfs
>> collector.sinks.k1.channel=ch1
>> collector.sinks.k1.hdfs.path=/flume/%y%m%d
>> collector.sinks.k1.hdfs.fileType = DataStream
>> collector.sinks.k1.hdfs.rollInterval = 86400
>> collector.sinks.k1.hdfs.rollSize = 0
>> collector.sinks.k1.hdfs.rollCount = 0
>> collector.sinks.k1.hdfs.serializer = HEADER_AND_TEXT
>>
>>
>>
>> On 9/16/14 11:10 AM, Ashish wrote:
>>
>>  Try using HEADER_AND_TEXT as the serializer for the sink; the default is
>> the text serializer, which writes only the event body.
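>>
>> With HEADER_AND_TEXT, each output line should start with the event's
>> header map followed by the body, roughly like this (illustrative, not
>> actual output):
>>
>> {datacenter=NYC_01} <original log line>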
>>
>> On Tue, Sep 16, 2014 at 7:31 PM, christopher palm <cpalm3@gmail.com> wrote:
>>
>> > All,
>> >
>> > I am trying to get the static interceptor to insert key/value
>> > information into each line that is written out by my data sink.
>> > I have tried various configurations, but I can't seem to get any output
>> > from the interceptor to show up in the files Flume produces in the
>> > target data directory.
>> > Below is my latest config, using spooldir as the source and a file_roll
>> > sink as the output.
>> >
>> > Any suggestions as to what I am configuring wrong here?
>> >
>> > Thanks,
>> > Chris
>> >
>> > client.channels=ch1
>> > client.channels.ch1.type=memory
>> > client.channels.ch1.capacity=100000
>> > client.channels.ch1.transactionCapacity=100000
>> >
>> > client.sources=src-1
>> > client.sources.src-1.type=spooldir
>> > client.sources.src-1.spoolDir=/opt/app/solr/flume/sinkIn
>> > client.sources.src-1.deserializer.maxLineLength=10000
>> > client.sources.src-1.interceptors = i1
>> > client.sources.src-1.interceptors.i1.type = static
>> > client.sources.src-1.interceptors.i1.preserveExisting = false
>> > client.sources.src-1.interceptors.i1.key = datacenter
>> > client.sources.src-1.interceptors.i1.value = NYC_01
>> > client.sinks=k1
>> > #client.sinks.k1.type=avro
>> > #client.sinks.k1.hostname=localhost
>> > #client.sinks.k1.port=42424
>> > ## Debugging Sink, Comment out AvroSink if you use this one
>> > # http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
>> > client.sinks.k1.type = file_roll
>> > client.sinks.k1.sink.directory = /opt/app/solr/flume/sinkOut
>> > client.sinks.k1.sink.rollInterval = 0
>> >
>> > # Connect source and sink with channel
>> > client.sources.src-1.channels=ch1
>> > client.sinks.k1.channel=ch1
>> >
>>
>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>>
>>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>
