flume-user mailing list archives

From Jeremy Karlson <jeremykarl...@gmail.com>
Subject Re: Flume Data Directory Cleanup
Date Thu, 18 Jul 2013 17:06:41 GMT
To follow up:

My Flume agent ran out of disk space last night and appeared to stop
processing.  I shut it down and as an experiment (it's a test machine, why
not?) I deleted the oldest 10 data files, to see if Flume actually needed
these when it restarted.

Flume was not happy with my choices.

It spit out a lot of this:

2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource Avro source mySource: Unable to process event batch. Exception follows.
java.lang.IllegalStateException: Channel closed [channel=myFileChannel]. Due to java.lang.NullPointerException: null
Caused by: java.lang.NullPointerException
        at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
        at org.apache.flume.channel.file.Log.replay(Log.java:406)

So it seems like these files were actually in use, and not just leftover
cruft.  A worthwhile thing to know, but I'd like to understand why.  My
events are probably at most 1 KB of text each, so it seems odd to me that
they'd consume more than 50 GB of disk space in the channel.
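
A rough sanity check of the figures in this thread, as a sketch only: it
takes the "at most 1k" event size as 1,000 bytes and ignores any per-event
overhead the file channel adds on disk, so the real gap may be smaller.
The constant names are made up for illustration.

```python
# Back-of-the-envelope comparison of queued payload vs. data-file footprint.
# All figures come from the thread; the 1,000-byte event size is an assumed
# upper bound, and on-disk record overhead is deliberately not modelled.
EVENTS_QUEUED = 100_000      # "about 100,000 events queued up"
BYTES_PER_EVENT = 1_000      # "at most 1k of text" (assumption: 1k = 1,000 bytes)
DATA_FILES = 50              # "about 50 data files"
BYTES_PER_FILE = 1.6e9       # "each about 1.6 GB"

payload = EVENTS_QUEUED * BYTES_PER_EVENT   # bytes of queued event bodies
on_disk = DATA_FILES * BYTES_PER_FILE       # bytes occupied by the data files
ratio = on_disk / payload                   # how much larger the files are

print(f"queued payload: {payload / 1e6:.0f} MB")
print(f"data files:     {on_disk / 1e9:.0f} GB")
print(f"ratio:          {ratio:.0f}x")
```

Under those assumptions the queued events account for roughly 100 MB,
against roughly 80 GB of data files.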

-- Jeremy

On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <jeremykarlson@gmail.com> wrote:

> Hi All,
> I have a very busy channel that has about 100,000 events queued up.  My
> data directory has about 50 data files, each about 1.6 GB.  I don't believe
> my 100k events could be consuming that much space (though I suppose it's
> possible), so I'm jumping to conclusions and assuming that most of these
> files are old and due for cleanup.  I'm not finding much guidance in the
> user guide on how often these files are cleaned up / removed / compacted /
> etc.
> Any thoughts on what's going on here, or what settings I should look for?
>  Thanks.
> -- Jeremy
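
Since deleting files by hand went badly, a read-only look at the data
directory may be a safer first step. The sketch below only lists files with
their size and modification time, oldest first; it does not know which files
Flume still needs. The function name and the example path are hypothetical.

```python
from pathlib import Path


def summarize_data_dir(data_dir):
    """Return (files, total_bytes) for the regular files in data_dir.

    files is a list of (name, size_in_bytes, mtime) tuples sorted oldest
    first. Nothing is modified or deleted.
    """
    entries = []
    for path in Path(data_dir).iterdir():
        if path.is_file():
            st = path.stat()
            entries.append((path.name, st.st_size, st.st_mtime))
    entries.sort(key=lambda entry: entry[2])  # oldest modification time first
    total = sum(size for _, size, _ in entries)
    return entries, total


# Example (the path is a placeholder, not taken from the thread):
# files, total = summarize_data_dir("/path/to/flume/data")
# for name, size, _ in files:
#     print(f"{size / 1e9:6.2f} GB  {name}")
# print(f"total: {total / 1e9:.2f} GB")
```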
