flume-user mailing list archives

From Jeff Lord <jl...@cloudera.com>
Subject Re: Flume log event per file
Date Thu, 27 Feb 2014 17:43:39 GMT
It looks like you have not configured any properties for "rolling"
files on HDFS.
The default rollCount is 10 (events).

http://flume.apache.org/FlumeUserGuide.html#hdfs-sink

The Flume HDFS sink can be configured to roll based on size, number
of events, or time:

hdfs.rollInterval (default 30): Number of seconds to wait before
rolling the current file (0 = never roll based on time interval)
hdfs.rollSize (default 1024): File size to trigger a roll, in bytes
(0 = never roll based on file size)
hdfs.rollCount (default 10): Number of events written to the file
before it is rolled (0 = never roll based on number of events)
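For example, to get larger files you could override those defaults on
your sink, along these lines (the rollCount value of 10000 is just an
illustrative choice; the other two triggers are set to 0 to disable
rolling on size and time):

agent.sinks.hdfsSink.hdfs.rollCount = 10000
agent.sinks.hdfsSink.hdfs.rollSize = 0
agent.sinks.hdfsSink.hdfs.rollInterval = 0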

On Thu, Feb 27, 2014 at 7:49 AM, orahad bigdata <oraclehad@gmail.com> wrote:
> Hi All,
>
> I'm new to Flume. I have a small Hadoop setup with a Flume agent on it, and
> I'm using tail -f logfilename as a source.
>
> When I started the agent it ingested data into HDFS, but each file only
> contains 10 lines. Can we configure the number of lines per file on HDFS?
>
> Below is my agent conf file.
>
>
> agent.sources = pstream
> agent.channels = memoryChannel
> agent.channels.memoryChannel.type = memory
> agent.channels.memoryChannel.capacity = 100000
> agent.channels.memoryChannel.transactionCapacity = 10000
> agent.sources.pstream.channels = memoryChannel
> agent.sources.pstream.type = exec
> agent.sources.pstream.command = tail -f /root/dummylog
> agent.sources.pstream.batchSize=1000
> agent.sinks = hdfsSink
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.channel = memoryChannel
> agent.sinks.hdfsSink.hdfs.path = hdfs://xxxxx:xxx/somepath
> agent.sinks.hdfsSink.hdfs.fileType = DataStream
> agent.sinks.hdfsSink.hdfs.writeFormat = Text
>
>
> Thanks
