flume-user mailing list archives

From kishore alajangi <alajangikish...@gmail.com>
Subject Re: spooldir to hdfs
Date Thu, 26 Jun 2014 07:29:55 GMT
It works when I use avro as the source and hdfs as the sink. The
configuration is below:

agent.sources = r1
agent.channels = c1
agent.sinks = k1

agent.sources.r1.type = avro
agent.sources.r1.channels = c1
agent.sources.r1.bind = localhost
agent.sources.r1.port = 10001

agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000
agent.channels.c1.transactionCapacity = 100
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = /flume
agent.sinks.k1.channel = c1

I start the Flume agent with the above conf file by running the command
below:

# flume-ng agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/flume.conf -n agent -Dflume.root.logger=DEBUG,console

After the source is created and started, and the agent is watching the
configuration file for changes, I run the command below to send the content
to that port for the Flume source to read:

# flume-ng avro-client -H localhost -p 10001 -F /<path of the avrofile>

The content is written successfully to HDFS under the given path /flume.

However, my requirement is different: a large number of new files arrive in
my directory every second, and I need to store each of them in HDFS as soon
as it is generated. I think that with "spooldir" as the source, the Flume
agent will watch for new files and write them to the hdfs sink. But when I
tried spooldir as the source I got the error quoted below. Please help me
solve this problem.
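
For context on the "new files every second" case: as far as I understand,
the spooling directory source expects each file to be complete and unchanged
once it appears in the spool directory, and expects file names not to be
reused. A minimal sketch of one way to hand files over under those
assumptions is to write them to a staging directory first and then rename
them into the spool directory. STAGING_DIR, SPOOL_DIR and publish_to_spool
are hypothetical names used only for illustration; the two directories are
assumed to be on the same filesystem so that mv is a rename rather than a
copy.

```shell
#!/bin/sh
# Sketch only: hand a finished file to a Flume spooling directory.
# STAGING_DIR and SPOOL_DIR are hypothetical paths (assumptions, not taken
# from any real setup). Keep them on the same filesystem so mv is a rename.
STAGING_DIR="${STAGING_DIR:-/tmp/flume-staging}"
SPOOL_DIR="${SPOOL_DIR:-/tmp/flume-spool}"

mkdir -p "$STAGING_DIR" "$SPOOL_DIR"

# publish_to_spool FILE
# Moves a completed file into SPOOL_DIR and prints the destination path.
publish_to_spool() {
    src="$1"
    # Prefix with a timestamp and the shell PID to reduce the chance of
    # reusing a file name inside the spool directory.
    dest="$SPOOL_DIR/$(date +%s)-$$-$(basename "$src")"
    mv "$src" "$dest" && printf '%s\n' "$dest"
}

# Example call (commented out; the file name is hypothetical):
# publish_to_spool "$STAGING_DIR/sample.avro"
```

The producer would write into STAGING_DIR and call publish_to_spool only
after a file is closed, so the source never sees a half-written file.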



On Thu, Jun 26, 2014 at 10:45 AM, kishore alajangi <
alajangikishore@gmail.com> wrote:

> Yes Sharninder, I tried the configuration below, but it throws a
> "flume.EventDeliveryException: Could not find schema for event" error.
> Please help me.
>
> First I copied the schema file to /etc/flume.conf/schemas/avro.avsc
> (I created this directory explicitly).
>
> #defined source as spooldir, channel as memory, sink as hdfs,
> agent.sources = s1
> agent.channels = c1
> agent.sinks = r1
>
>
> agent.sources.s1.type = spooldir
> agent.sources.s1.spoolDir = <dir path>
> agent.sources.s1.deserializer = avro
> agent.sources.s1.deletePolicy = immediate
> agent.sources.s1.channels = c1
> agent.sources.s1.interceptors = attach-schema
> agent.sources.s1.interceptors.attach-schema = static
> agent.sources.s1.interceptors.attach-schema.key = flume.avro.schema.url
> agent.sources.s1.interceptors.attach-schema.value = file:/etc/flume-ng/schemas/avro.avsc
>
> agent.channels.c1.type = memory
> agent.channels.c1.capacity = 10000000
> agent.channels.c1.transactionCapacity = 1000
>
> agent.sinks.s1.type = hdfs
> agent.sinks.s1.hdfs.path = /flume/
> agent.sinks.s1.hdfs.fileType = DataStream
> agent.sinks.s1.channel = c1
> agent.sinks.s1.hdfs.batchSize = 100
> agent.sinks.s1.hdfs.serializer = org.apache.flume.sink.hdfs.AvroEventSerializer$Builder
>
>
> On Wed, Jun 25, 2014 at 2:07 PM, Sharninder <sharninder@gmail.com> wrote:
>
>> Did you try using the spooldir source with an hdfs sink? What problems
>> did you face?
>>
>> --
>> Sharninder
>>
>>
>>
>> On Wed, Jun 25, 2014 at 12:15 PM, kishore alajangi <
>> alajangikishore@gmail.com> wrote:
>>
>>> Hi Flume Experts,
>>>
>>> Could anybody help me store avro files located in my local filesystem
>>> into hdfs using flume? New files will be generated frequently, so would
>>> "spooldir" work for it as the source?
>>>
>>> --
>>> Thanks,
>>> Kishore.
>>>
>>
>>
>
>
> --
> Thanks,
> Kishore.
>



-- 
Thanks,
Kishore.
