From: Hari Shreedharan <hshreedharan@cloudera.com>
Date: Wed, 2 Jul 2014 20:47:46 -0700
Subject: Re: File Channel Backup Checkpoints are I/O Intensive
To: user@flume.apache.org

Thanks a lot. I will take a look at this tomorrow or early next week.

On Wed, Jul 2, 2014 at 5:29 PM, Abraham Fine wrote:
> Hari-
>
> I added the new tests and created a new revision to my patch.
>
> https://issues.apache.org/jira/secure/attachment/12653728/compress_backup_checkpoint_new_tests.patch
>
> Thanks,
> Abe
>
> --
> Abraham Fine | Software Engineer
> (516) 567-2535
> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
>
>
> On Wed, Jul 2, 2014 at 4:32 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:
>
>> Hi Abraham,
>>
>> In general, the patch looks good. Can you add a couple of tests:
>> * Original checkpoint is uncompressed, config changes to compress the
>> checkpoint - does the file channel restart from the original checkpoint? Are
>> new checkpoints compressed?
>> * Compressed checkpoint, config changes to not compress the checkpoint - does
>> the channel start up? Are new checkpoints uncompressed?
>>
>>
>> Hari
>>
>>
>> On Wed, Jul 2, 2014 at 3:06 PM, Abraham Fine wrote:
>>
>>> Hi Brock and Hari-
>>>
>>> I was just wondering if either of you had a chance to take a look at the
>>> patch and if there is anything I can do to improve it.
>>>
>>> Thanks,
>>> Abe
>>>
>>>
>>> On Wed, Jun 11, 2014 at 6:48 PM, Brock Noland wrote:
>>>
>>>> This is a great suggestion Abraham!
>>>>
>>>>
>>>> On Wed, Jun 11, 2014 at 5:39 PM, Hari Shreedharan wrote:
>>>>
>>>>> Thanks. I will review it :)
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Wednesday, June 11, 2014 at 5:00 PM, Abraham Fine wrote:
>>>>>
>>>>> I went ahead and created a JIRA and patch:
>>>>> https://issues.apache.org/jira/browse/FLUME-2401
>>>>>
>>>>> The option is configurable with:
>>>>> agentX.channels.ch1.compressBackupCheckpoint = true
>>>>>
>>>>> As per your recommendation, I used snappy-java. I also considered the
>>>>> snappy and lz4 implementations in Hadoop IO but noticed that the
>>>>> Hadoop IO dependency was removed in
>>>>> https://issues.apache.org/jira/browse/FLUME-1285
>>>>>
>>>>> Thanks,
>>>>> Abe
>>>>>
>>>>>
>>>>> On Mon, Jun 9, 2014 at 4:01 PM, Hari Shreedharan wrote:
>>>>>
>>>>> Hi Abraham,
>>>>>
>>>>> Compressing the backup checkpoint is very possible. The backup is
>>>>> rarely read: it is used only if the original checkpoint is corrupt on
>>>>> restart. So compressing it with something like Snappy would make sense
>>>>> (GZIP might hurt performance). Can you try snappy-java and see if it
>>>>> gives good performance and reasonable compression?
>>>>>
>>>>> Patches are always welcome. I'd be glad to review and commit it.
>>>>> I would
>>>>> suggest making the compression optional via configuration, so that
>>>>> anyone with a smaller channel doesn't end up spending CPU for little gain.
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Monday, June 9, 2014 at 3:56 PM, Abraham Fine wrote:
>>>>>
>>>>> Hello-
>>>>>
>>>>> We are using Flume 1.4 with the File Channel configured with a very
>>>>> large capacity. We keep the checkpoint and the backup checkpoint on
>>>>> separate disks.
>>>>>
>>>>> Normally the file channel is mostly empty (<<1% of capacity). For the
>>>>> checkpoint, disk I/O is very reasonable because it is written through
>>>>> a MappedByteBuffer.
>>>>>
>>>>> On the other hand, the backup checkpoint seems to be written to disk
>>>>> in its entirety over and over again, resulting in very high disk
>>>>> utilization.
>>>>>
>>>>> I noticed that, because the checkpoint file is mostly empty, it is
>>>>> very compressible: I was able to GZIP our checkpoint from 381M down to
>>>>> 386K. I was wondering if it would be possible to always compress the
>>>>> backup checkpoint before writing it to disk.
>>>>>
>>>>> I would be happy to work on a patch to implement this functionality if
>>>>> there is interest.
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> --
>>>>> Abraham Fine | Software Engineer
>>>>> (516) 567-2535
>>>>> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
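[Editor's note] Abe's observation that a mostly-empty checkpoint compresses from 381M down to 386K is easy to reproduce: the checkpoint is dominated by zeroed slots, which any LZ-style codec collapses almost entirely. A minimal sketch using the JDK's built-in Deflater/Inflater as a stand-in (the actual FLUME-2401 patch uses snappy-java, which is not assumed here); the 1 MiB zero buffer models a mostly-empty checkpoint file:

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CheckpointCompressionDemo {

    // Compress a byte buffer in one shot; BEST_SPEED approximates the
    // speed/ratio trade-off Hari suggests (fast codec over GZIP-default).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[input.length + 64]; // worst case: incompressible input plus header overhead
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    // Round-trip back to the original bytes: a backup checkpoint is only
    // useful if decompression is exactly lossless.
    static byte[] decompress(byte[] compressed, int originalLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int off = 0;
        while (off < out.length && !inflater.finished()) {
            off += inflater.inflate(out, off, out.length - off);
        }
        inflater.end();
        return out;
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] checkpoint = new byte[1 << 20]; // 1 MiB of zeros: a mostly-empty checkpoint
        byte[] compressed = compress(checkpoint);
        byte[] restored = decompress(compressed, checkpoint.length);
        System.out.println("raw=" + checkpoint.length
                + " compressed=" + compressed.length
                + " lossless=" + Arrays.equals(restored, checkpoint));
    }
}
```

The compressed size lands in the low kilobytes, mirroring the ~1000x ratio Abe measured with GZIP; the trade-off the thread settles on is that Snappy gives a somewhat lower ratio but much lower CPU cost per backup write.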
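[Editor's note] For context, the option discussed in the thread is a per-channel setting in the agent's properties file, next to the existing file channel checkpoint settings. A sketch with hypothetical agent/channel names and example paths; `compressBackupCheckpoint` is the new key proposed in FLUME-2401, the other keys are standard file channel configuration:

```properties
# Hypothetical agent "agentX" with file channel "ch1"; paths are examples.
agentX.channels.ch1.type = file
agentX.channels.ch1.checkpointDir = /data1/flume/checkpoint
# Dual checkpoints enable the backup copy this thread is about;
# keeping it on a separate disk matches Abe's setup.
agentX.channels.ch1.useDualCheckpoints = true
agentX.channels.ch1.backupCheckpointDir = /data2/flume/checkpoint-backup
# New, optional knob from the FLUME-2401 patch: compress the backup
# checkpoint before writing it, trading some CPU for far less disk I/O.
agentX.channels.ch1.compressBackupCheckpoint = true
```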