From: Paul Chavez
To: "user@flume.apache.org"
Date: Mon, 17 Jun 2013 09:16:02 -0700
Subject: RE: Flume events rolling file too regularly

There are three file roll settings on the HDFS sink: rollInterval, rollSize and rollCount. Their defaults are 30s, 1024B and 10 events, respectively. You need to set each one as desired or disable it (setting a value to 0 disables that trigger).

My mail client chopped up your config, but I did a search and didn't see any of those properties set. I would start there.

Hope that helps,
Paul Chavez

-----Original Message-----
From: Josh Myers [mailto:josh.myers@mydrivesolutions.com]
Sent: Monday, June 17, 2013 6:47 AM
To: user@flume.apache.org
Subject: Flume events rolling file too regularly

Hi guys,

We are sending JSON events from our pipeline into a flume http source. We have written a custom multiplexer and sink serializer. The events are being routed into the correct channels and consumed OK by the sinks. The custom serializer takes a JSON event and outputs a csv. Files are being written to s3 (using s3n as hdfs) but rather than appending to the written csv file, each event seems to be generating its own csv. The output is what I would expect using rollCount 1, however we do occasionally get several events (maybe 4) written per csv. Please see below for config.

Ideally we want to use rollInterval of 24 hours, to generate a new .csv file every 24 hours, but have events pretty quickly flushed to the csv file after being sent. So one csv per day that is consistently appended with whatever events we throw in. We found however that with a rollInterval of 24 hours the events weren't being flushed often enough...

Any help would be hugely appreciated!

Thanks.
Josh
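
For reference, a minimal sketch of the three roll settings named in the reply, set for a single file per day. The agent name `agent1` and sink name `s3sink` are hypothetical, since the original config did not survive in this message; only the `hdfs.roll*` property names come from the HDFS sink itself.

```properties
# Hypothetical agent and sink names; substitute the ones from your own config.

# Roll to a new file every 24 hours (value is in seconds; default is 30).
agent1.sinks.s3sink.hdfs.rollInterval = 86400

# 0 disables size-based rolling (default is 1024 bytes).
agent1.sinks.s3sink.hdfs.rollSize = 0

# 0 disables count-based rolling (default is 10 events).
agent1.sinks.s3sink.hdfs.rollCount = 0
```

With the size and count triggers disabled, only the time-based trigger should close a file. Leaving any of the three unset means its default still applies, which would explain files holding only a handful of events.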