flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: File channels creating many large files
Date Fri, 07 Nov 2014 17:59:12 GMT
Flume will leave at least 2 files per data directory. Once you have enough events to cause
a second file to be created, there will always be at least 2 per directory. You can use the
maxFileSize parameter (specified in bytes) to control the size of these files.
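
As a rough sketch of that setting, the fragment below adds maxFileSize to the ch1
channel from the original question. The 256 MB value is only an illustrative
assumption, not a recommendation; pick a size that suits your disk and event volume.

```properties
# Existing file channel settings from the original configuration
a1.channels.ch1.type = file
a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
a1.channels.ch1.capacity = 100000
a1.channels.ch1.transactionCapacity = 5000

# Cap each data log file at 256 MB (value is in bytes; illustrative only)
a1.channels.ch1.maxFileSize = 268435456
```

With a smaller cap the channel rolls to a new log file sooner, so each file stays
smaller, but expect at least 2 files to remain in each data directory.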



Thanks,
Hari

On Fri, Nov 7, 2014 at 10:25 AM, Jeff Lord <jlord@cloudera.com> wrote:

> Guy,
> What version of flume is this?
> -Jeff
> On Fri, Nov 7, 2014 at 1:19 AM, Needham, Guy <Guy.Needham@virginmedia.co.uk>
> wrote:
>>  Hi all,
>>
>> I have a configuration with a file channel configured such that:
>>
>> a1.channels.ch1.type = file
>> a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
>> a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
>> a1.channels.ch1.capacity = 100000
>> a1.channels.ch1.transactionCapacity = 5000
>>
>> It's been running since October 28th with no issues, but when I looked
>> today in /hadoop/user/flume/channels/data I saw that the file channel was
>> building up large files which had been processed and not deleting them:
>>
>> [rdd@hadoop-kn-p2-m01 flume]$ ls -lh channels/data/
>> total 1.6G
>> -rw-r----- 1 rdd rdd 1.5G Oct 28 16:10 log-1
>> -rw-r----- 1 rdd rdd   47 Oct 28 16:10 log-1.meta
>> -rw-r----- 1 rdd rdd  72M Oct 31 16:28 log-2
>> -rw-r----- 1 rdd rdd   47 Oct 31 16:29 log-2.meta
>> It seems like for each day that data landed (we're still in testing so
>> data not landing constantly) a data file has been created but not deleted
>> when reading was completed.
>> Is this expected behaviour? Is there a way to stop large files building up
>> and still use the file channel?
>> Regards,
>> Guy Needham | Data Discovery
>> Virgin Media | Enterprise Data, Design & Management
>> Bartley Wood Business Park, Hook, Hampshire RG27 9UP
>> D 01256 75 3362
>> I welcome VSRE emails. Learn more at http://vsre.info/