Return-Path: X-Original-To: apmail-kafka-dev-archive@www.apache.org Delivered-To: apmail-kafka-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 5E8AC10888 for ; Mon, 3 Mar 2014 17:41:33 +0000 (UTC) Received: (qmail 63739 invoked by uid 500); 3 Mar 2014 17:41:27 -0000 Delivered-To: apmail-kafka-dev-archive@kafka.apache.org Received: (qmail 63487 invoked by uid 500); 3 Mar 2014 17:41:22 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 63371 invoked by uid 99); 3 Mar 2014 17:41:22 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 03 Mar 2014 17:41:22 +0000 Date: Mon, 3 Mar 2014 17:41:21 +0000 (UTC) From: "Jay Kreps (JIRA)" To: dev@kafka.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (KAFKA-615) Avoid fsync on log segment roll MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Kreps updated KAFKA-615: ---------------------------- Issue Type: New Feature (was: Bug) > Avoid fsync on log segment roll > ------------------------------- > > Key: KAFKA-615 > URL: https://issues.apache.org/jira/browse/KAFKA-615 > Project: Kafka > Issue Type: New Feature > Reporter: Jay Kreps > Assignee: Jay Kreps > Fix For: 0.8.1 > > Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch, KAFKA-615-v7.patch, KAFKA-615-v8.patch > > > It still isn't feasible to run without an application level fsync policy. This is a problem as fsync locks the file and tuning such a policy so that the flushes aren't so frequent that seeks reduce throughput, yet not so infrequent that the fsync is writing so much data that there is a noticable jump in latency is very challenging. > The remaining problem is the way that log recovery works. Our current policy is that if a clean shutdown occurs we do no recovery. If an unclean shutdown occurs we recovery the last segment of all logs. To make this correct we need to ensure that each segment is fsync'd before we create a new segment. Hence the fsync during roll. > Obviously if the fsync during roll is the only time fsync occurs then it will potentially write out the entire segment which for a 1GB segment at 50mb/sec might take many seconds. The goal of this JIRA is to eliminate this and make it possible to run with no application-level fsyncs at all, depending entirely on replication and background writeback for durability. -- This message was sent by Atlassian JIRA (v6.2#6252)