Date: Sat, 27 Feb 2016 03:48:18 +0000 (UTC)
From: "Daniel Templeton (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-9782) RollingFileSystemSink should have configurable roll interval

     [ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Templeton updated HDFS-9782:
-----------------------------------
    Attachment: HDFS-9782.005.patch

Turns out that HADOOP-8608 doesn't actually help me. It adds the {{getTimeDuration()}} method to {{Configuration}}, but metrics are initialized with a {{SubsetConfiguration}} (from Apache Commons). Looks like I still have to do the parsing by hand.
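For illustration only, hand parsing of an interval string might look roughly like the sketch below. This is not the code from the attached patch. The class name {{RollIntervalParser}}, the accepted unit spellings, and the choice of hours as the default unit are all assumptions made for the example. The input is taken as a plain {{String}} so the sketch does not depend on any configuration class.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; names and accepted formats are assumptions. */
public class RollIntervalParser {
    // A non-negative integer followed by an optional unit word, e.g. "90 minutes" or "2h".
    private static final Pattern INTERVAL =
        Pattern.compile("^\\s*(\\d+)\\s*([A-Za-z]*)\\s*$");

    /** Converts an interval string to milliseconds using TimeUnit. */
    public static long parseMillis(String value) {
        Matcher m = INTERVAL.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized interval: " + value);
        }
        long amount = Long.parseLong(m.group(1));
        String unit = m.group(2).toLowerCase(Locale.ROOT);

        switch (unit) {
            case "":            // assumption: a bare number means hours
            case "h":
            case "hour":
            case "hours":
                return TimeUnit.HOURS.toMillis(amount);
            case "m":
            case "min":
            case "minute":
            case "minutes":
                return TimeUnit.MINUTES.toMillis(amount);
            case "s":
            case "sec":
            case "second":
            case "seconds":
                return TimeUnit.SECONDS.toMillis(amount);
            case "d":
            case "day":
            case "days":
                return TimeUnit.DAYS.toMillis(amount);
            default:
                throw new IllegalArgumentException("Unrecognized unit: " + unit);
        }
    }
}
```

With these assumptions, {{RollIntervalParser.parseMillis("90 minutes")}} and {{RollIntervalParser.parseMillis("2h")}} both go through {{TimeUnit}} for the arithmetic, and anything that does not match the pattern is rejected with an {{IllegalArgumentException}} rather than silently defaulted.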
I have switched over to using {{TimeUnit}} for the conversions, though.

After poking at trying to fake the clock, I came to a useful realization. The {{BPServiceActor}} tests are just trying to test the timing. I'm already doing that in the {{TestRollingFileSystemSink}} tests. What I'm trying to test in the test with the sleeps is whether the flush thread successfully flushes the logs, which can only be tested by actually scheduling it to run. With that in mind, I found a way to test that functionality with no sleeps in the common case. The sleeps are still there, just in case, but I've never seen it sleep even once. I also bumped the max sleep time in the test way up so that the chance of flakiness is approximately 0.

I still need to do more manual testing, but first let's see if this passes muster.

> RollingFileSystemSink should have configurable roll interval
> ------------------------------------------------------------
>
>                 Key: HDFS-9782
>                 URL: https://issues.apache.org/jira/browse/HDFS-9782
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch
>
>
> Right now it defaults to rolling at the top of every hour. Instead that interval should be configurable. The interval should also allow for some play so that all hosts don't try to flush their files simultaneously.
> I'm filing this in HDFS because I suspect it will involve touching the HDFS tests. If it turns out not to, I'll move it into common instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
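As a rough illustration of the "some play" asked for in the issue description, the delay until the next roll could be computed as the time to the next interval boundary plus a random per-host offset. The sketch below is an assumption about one way to do this, not the approach taken in the attached patches. The class name {{RollSchedule}}, the method name, and the parameter names are hypothetical, and boundaries are aligned to the Unix epoch for simplicity.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper; not taken from the HDFS-9782 patches. */
public class RollSchedule {
    /**
     * Returns how long to wait, in milliseconds, before the next roll:
     * the time remaining until the next interval boundary, plus a random
     * offset in [0, maxOffsetMillis] so hosts do not all flush at once.
     */
    public static long nextRollDelayMillis(long nowMillis,
                                           long intervalMillis,
                                           long maxOffsetMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        if (maxOffsetMillis < 0 || maxOffsetMillis == Long.MAX_VALUE) {
            throw new IllegalArgumentException("offset out of range");
        }

        // Time elapsed since the most recent boundary (epoch-aligned).
        long sinceBoundary = Math.floorMod(nowMillis, intervalMillis);
        // If now is exactly on a boundary, this waits one full interval.
        long untilBoundary = intervalMillis - sinceBoundary;

        // nextLong(bound) is exclusive of bound, so add 1 to include the max.
        long offset = ThreadLocalRandom.current().nextLong(maxOffsetMillis + 1);

        return untilBoundary + offset;
    }
}
```

With a one-hour interval and a zero offset this reduces to rolling at the top of every hour (in UTC, given the epoch alignment assumed here). A nonzero maximum offset spreads each host's roll over the window that follows the boundary.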