flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject RE: File channels creating many large files
Date Mon, 10 Nov 2014 19:29:56 GMT
Flume removes them when there are no more events to be read from them (though once 2 or more
files are created, there will be a minimum of 2 files - that is just a safety net).


Thanks,
Hari

On Mon, Nov 10, 2014 at 1:23 AM, Needham, Guy
<Guy.Needham@virginmedia.co.uk> wrote:

> Is there a concept of a data file which is 'done'? Does Flume remove data files it no longer needs, or will these build up?
> Regards,
> Guy Needham | Data Discovery
> Virgin Media | Enterprise Data, Design & Management
> Bartley Wood Business Park, Hook, Hampshire RG27 9UP
> D 01256 75 3362
> I welcome VSRE emails. Learn more at http://vsre.info/
> ________________________________
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> Sent: 10 November 2014 09:15
> To: user@flume.apache.org
> Cc: user@flume.apache.org
> Subject: RE: File channels creating many large files
> That value is in bytes. At 500k, you will likely end up with too many files. You should set it as high as you can.
> Thanks, Hari
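
[Note, not part of the original message: a minimal sketch of the setting discussed above. It reuses the channel name and directories from the configuration quoted later in this thread and adds maxFileSize, which is given in bytes. The 1073741824 value (1 GiB) is purely illustrative and an assumption on my part. To the best of my knowledge the Flume user guide lists a default of 2146435071 bytes, so leaving the property unset should already give large data files.]

```properties
# File channel from the original post, plus an explicit maxFileSize.
a1.channels.ch1.type = file
a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
a1.channels.ch1.capacity = 100000
a1.channels.ch1.transactionCapacity = 5000
# Value is in bytes. 1073741824 = 1 GiB (illustrative, not a recommendation from the thread).
# A small value such as 500000 would roll to a new log-N file roughly every 500 KB.
a1.channels.ch1.maxFileSize = 1073741824
```

Whatever value is chosen, the thread indicates that at least 2 data files per data directory remain once 2 have been created.
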
> On Mon, Nov 10, 2014 at 1:05 AM, Needham, Guy <Guy.Needham@virginmedia.co.uk> wrote:
> Hari, Jeff,
> thanks for your replies. It's Flume 1.5.0, I'll use the maxFileSize parameter to fix this. Is there any impact on channel optimisation from setting it to say 500000?
> Regards,
> Guy Needham | Data Discovery
> ________________________________
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> Sent: 07 November 2014 17:59
> To: user@flume.apache.org
> Cc: user@flume.apache.org
> Subject: Re: File channels creating many large files
> Flume will leave at least 2 files per data directory. Once you have enough events to cause 2 files to be created, there will be at least 2 per dir. You can use the maxFileSize parameter to control the size of these files.
> Thanks, Hari
> On Fri, Nov 7, 2014 at 10:25 AM, Jeff Lord <jlord@cloudera.com> wrote:
> Guy,
> What version of Flume is this?
> -Jeff
> On Fri, Nov 7, 2014 at 1:19 AM, Needham, Guy <Guy.Needham@virginmedia.co.uk> wrote:
> Hi all,
> I have a configuration with a file channel configured such that:
> a1.channels.ch1.type = file
> a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
> a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
> a1.channels.ch1.capacity = 100000
> a1.channels.ch1.transactionCapacity = 5000
> It's been running since October 28th with no issues, but when I looked today in /hadoop/user/flume/channels/data I saw that the file channel was building up large files which had been processed and not deleting them:
> [rdd@hadoop-kn-p2-m01 flume]$ ls -lh channels/data/
> total 1.6G
> -rw-r----- 1 rdd rdd 1.5G Oct 28 16:10 log-1
> -rw-r----- 1 rdd rdd   47 Oct 28 16:10 log-1.meta
> -rw-r----- 1 rdd rdd  72M Oct 31 16:28 log-2
> -rw-r----- 1 rdd rdd   47 Oct 31 16:29 log-2.meta
> It seems like for each day that data landed (we're still in testing so data not landing constantly) a data file has been created but not deleted when reading was completed.
> Is this expected behaviour? Is there a way to stop large files building up and still use the file channel?
> Regards,
> Guy Needham | Data Discovery
> --------------------------------------------------------------------
> Save Paper - Do you really need to print this e-mail?
> Visit www.virginmedia.com for more information, and more fun.
> This email and any attachments are or may be confidential and legally privileged
> and are sent solely for the attention of the addressee(s). If you have received this
> email in error, please delete it from your system: its use, disclosure or copying is
> unauthorised. Statements and opinions expressed in this email may not represent
> those of Virgin Media. Any representations or commitments in this email are
> subject to contract.
> Registered office: Media House, Bartley Wood Business Park, Hook, Hampshire, RG27 9UP
> Registered in England and Wales with number 2591237
> --------------------------------------------------------------------