flume-user mailing list archives

From Zhiwen Sun <pens...@gmail.com>
Subject Re: Why used space of file channel buffer directory increase?
Date Wed, 20 Mar 2013 07:15:09 GMT
Thanks for your reply.

I will try syslog as source.
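
A minimal sketch of what that change might look like, assuming Flume's
syslog TCP source (type syslogtcp) replaces the netcat source in the
configuration quoted below. Port 5140 is my assumption; the bind address
and channel name are taken from the original config:

```properties
# Sketch only: swap the netcat source for a syslog TCP source.
# Port 5140 is an assumed value, not one mentioned in this thread.
a1.sources = r1
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = 192.168.201.197
a1.sources.r1.port = 5140
a1.sources.r1.channels = c2
```

The sender would then need to emit syslog-formatted messages (for example
by forwarding through syslog-ng) instead of piping raw lines through nc.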

Zhiwen Sun



On Wed, Mar 20, 2013 at 3:11 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:

> Hi,
>
> I suspect that tail -F and nc are filling up the directory. What's inside
> such a file that grows without an event?
>
> My assumption:
> nc opens one stream and delivers all incoming events over it. Flume
> doesn't know that no event is coming in, since the stream never breaks.
> I wonder if you could use syslog(-ng) for the event delivery?
>
> Cheers,
>  Alex
>
>
>
> On Mar 20, 2013, at 2:30 AM, Zhiwen Sun <pensz01@gmail.com> wrote:
>
> > Thanks all for your reply.
> >
> > @Kenison
> > I stopped my tail -F | nc program and no new event files appeared in HDFS,
> > so I think no events are arriving. To make sure, I will test again with
> > JMX enabled.
> >
> > @Alex
> >
> > The latest log follows. I can't see any exceptions or warnings.
> >
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 3
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659953997, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:17 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216278208, logWriteOrderID = 1363659953997
> > 13/03/19 15:28:17 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216278208 logWriteOrderID: 1363659953997
> > 13/03/19 15:28:26 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902
> > 13/03/19 15:28:27 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp
> >
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 2
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659954200, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:47 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216288815, logWriteOrderID = 1363659954200
> > 13/03/19 15:28:47 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216288815 logWriteOrderID: 1363659954200
> > 13/03/19 15:28:48 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904
> >
> > @Hari
> > Hmm, 12 hours have passed, and the size of the file channel directory has not decreased.
> >
> > Files in file channel directory:
> >
> > -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 log-6
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 ./file-channel/data/log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun 29 2013-03-19 10:12 ./file-channel/data/log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 29 2013-03-19 15:28 ./file-channel/data/log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 0 2013-03-19 09:15 ./file-channel/data/in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 ./file-channel/data/log-6
> >
> >
> >
> >
> >
> > Zhiwen Sun
> >
> >
> >
> > On Wed, Mar 20, 2013 at 2:32 AM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:
> > It is possible for the directory size to increase even if no writes are
> > going in to the channel. If the channel size is non-zero and the sink is
> > still writing events to HDFS, the takes get written to disk as well (so we
> > know what events in the files were removed when the channel/agent
> > restarts). Eventually the channel will clean up the files which have all
> > events taken (though it will keep at least 2 files per data directory, just
> > to be safe).
> >
> > --
> > Hari Shreedharan
> >
> > On Tuesday, March 19, 2013 at 10:32 AM, Alexander Alten-Lorenz wrote:
> >
> >> Hey,
> >>
> >> What does the debug output say? Can you gather the logs and attach them?
> >>
> >> - Alex
> >>
> >> On Mar 19, 2013, at 5:27 PM, "Kenison, Matt" <Matt.Kenison@disney.com> wrote:
> >>
> >>> Check the JMX counter first, to make sure you really are not sending
> >>> new events. If not, is it your checkpoint directory or data directory that
> >>> is increasing in size?
> >>>
> >>>
> >>> From: Zhiwen Sun <pensz01@gmail.com>
> >>> Reply-To: "user@flume.apache.org" <user@flume.apache.org>
> >>> Date: Tue, 19 Mar 2013 01:19:19 -0700
> >>> To: "user@flume.apache.org" <user@flume.apache.org>
> >>> Subject: Why used space of file channel buffer directory increase?
> >>>
> >>> Hi all,
> >>>
> >>> I am testing flume-ng on my local machine. The data flow is:
> >>>
> >>> tail -F file | nc 127.0.0.1 4444 > flume agent > hdfs
> >>>
> >>> My configuration file is here :
> >>>
> >>>> a1.sources = r1
> >>>> a1.channels = c2
> >>>>
> >>>> a1.sources.r1.type = netcat
> >>>> a1.sources.r1.bind = 192.168.201.197
> >>>> a1.sources.r1.port = 44444
> >>>> a1.sources.r1.max-line-length = 1000000
> >>>>
> >>>> a1.sinks.k1.type = logger
> >>>>
> >>>> a1.channels.c1.type = memory
> >>>> a1.channels.c1.capacity = 10000
> >>>> a1.channels.c1.transactionCapacity = 10000
> >>>>
> >>>> a1.channels.c2.type = file
> >>>> a1.sources.r1.channels = c2
> >>>>
> >>>> a1.sources.r1.interceptors = i1
> >>>> a1.sources.r1.interceptors.i1.type = timestamp
> >>>>
> >>>> a1.sinks = k2
> >>>> a1.sinks.k2.type = hdfs
> >>>> a1.sinks.k2.channel = c2
> >>>> a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/events/%Y-%m-%d
> >>>> a1.sinks.k2.hdfs.writeFormat = Text
> >>>> a1.sinks.k2.hdfs.rollInterval = 10
> >>>> a1.sinks.k2.hdfs.rollSize = 10000000
> >>>> a1.sinks.k2.hdfs.rollCount = 0
> >>>>
> >>>> a1.sinks.k2.hdfs.filePrefix = app
> >>>> a1.sinks.k2.hdfs.fileType = DataStream
> >>>
> >>>
> >>>
> >>> It seems that events were collected correctly.
> >>>
> >>> But there is a problem bothering me: the used space of the file channel
> >>> (~/.flume) keeps increasing, even when there are no new events.
> >>>
> >>> Is my configuration wrong, or is there some other problem?
> >>>
> >>> thanks.
> >>>
> >>>
> >>> Best regards.
> >>>
> >>> Zhiwen Sun
> >>
> >> --
> >> Alexander Alten-Lorenz
> >> http://mapredit.blogspot.com
> >> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
> >
> >
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>
