flume-user mailing list archives

From Michael Diamant <diamant.mich...@gmail.com>
Subject Re: Enabling file channel backup checkpoint causes significant disk IO at start-up
Date Mon, 08 Sep 2014 20:36:51 GMT
Hari, thank you for your quick reply.  A follow-up question to help me
figure out how best to proceed on my end:  Can you provide an estimate as
to when the next Flume release will occur?

On Mon, Sep 8, 2014 at 4:07 PM, Hari Shreedharan <hshreedharan@cloudera.com>
wrote:

> This patch should address the issue, if enabled:
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commitdiff;h=69fd6b3ad5e5b9ae6f1293b3d8e57ed57fd6701c;hp=f15f20785262ac3cb3e35c2a12e669b7a836d35f
> It will be part of the next Flume release (or CDH5.2.0).
> --
> Thanks,
> Hari
> On September 8, 2014 at 12:58 PM, Michael Diamant
> <diamant.michael@gmail.com> wrote:
> My team uses Flume 1.4.0 packaged with CDH5.0.2 via an embedded agent to
> write to a file channel.  In a previous thread started by my colleague,
> "FileChannel Replays consistently take a long time", and the associated
> issue, https://issues.apache.org/jira/browse/FLUME-2450, it was suggested
> that we use a backup checkpoint directory to avoid lengthy replays.  When
> I enabled the backup checkpoint directory, I observed via iotop nearly
> 100% IO from my application with the embedded agent.  This level of IO
> persists for about 30 seconds, rendering the application unusable during
> that time.
> For comparison, I monitored via iotop with the backup checkpoint disabled:
> IO activity lasts for at most several seconds.  That is, there is a
> qualitative difference when enabling the backup checkpoint directory.
> Additionally, I tried deleting the existing checkpoint and data
> directories to start with a clean slate.  The results of those experiments
> are in line with my observations above.
> Is this expected behavior when using a backup checkpoint directory?  Is
> there any way to reduce the amount of IO?  I would appreciate any
> feedback and insights because the current behavior is untenable for a
> production environment.
> Thank you,
> Michael
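
For readers unfamiliar with the setup described in the quoted message, the following is a minimal sketch of the channel portion of an embedded-agent configuration with the backup checkpoint enabled. It assumes the file channel's documented `useDualCheckpoints` and `backupCheckpointDir` properties and the embedded agent's `channel.` key prefix; the class name, method name, and directory paths are hypothetical, and the exact settings used in the thread are not stated. The map is deliberately partial: a real embedded agent also needs sink and processor settings before it can be configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds only the file-channel part of an embedded-agent
// configuration. The resulting map would be merged with sink settings and
// passed to EmbeddedAgent.configure(Map<String, String>).
class FileChannelBackupConfig {

    static Map<String, String> channelProperties(String checkpointDir,
                                                 String backupCheckpointDir,
                                                 String dataDirs) {
        // The Flume user guide requires the backup directory to differ from
        // both the checkpoint directory and the data directories.
        if (backupCheckpointDir.equals(checkpointDir)
                || backupCheckpointDir.equals(dataDirs)) {
            throw new IllegalArgumentException(
                "backupCheckpointDir must be a separate directory");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("channel.type", "file");
        props.put("channel.checkpointDir", checkpointDir);
        props.put("channel.dataDirs", dataDirs);
        // Enabling dual checkpoints is what turns on the backup checkpoint.
        props.put("channel.useDualCheckpoints", "true");
        props.put("channel.backupCheckpointDir", backupCheckpointDir);
        return props;
    }

    public static void main(String[] args) {
        // Paths below are placeholders, not taken from the thread.
        Map<String, String> props = channelProperties(
            "/var/lib/app/flume/checkpoint",
            "/var/lib/app/flume/checkpoint-backup",
            "/var/lib/app/flume/data");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Placing the backup directory on a different physical disk than the primary checkpoint may spread the write load, though whether that helps with the behavior reported here is not established in the thread.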
