Message-ID: <7478516.1182193292283.JavaMail.jira@brutus>
Date: Mon, 18 Jun 2007 12:01:32 -0700 (PDT)
From: "Hairong Kuang (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1003) Proposal to batch commits to edits log.
In-Reply-To: <30325315.1170986945510.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505923 ]

Hairong Kuang commented on HADOOP-1003:
---------------------------------------

+1 The patch looks good.

> Proposal to batch commits to edits log.
> ---------------------------------------
>
>                 Key: HADOOP-1003
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1003
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>         Attachments: editLogSync2.patch
>
>
> Right now the most expensive namenode operations are those that require commits to the edits log, e.g. creating, deleting, or renaming a file. Most of the time is spent in fsync() of the edits file (multiple fsync() calls in the case of multiple image directories). During this time the whole namesystem is under lock, and even non-mutating operations like open() are blocked.
> On a local filesystem, each fsync could take on the order of milliseconds. My understanding is that the guarantee the namenode provides is that the edits log is synced before replying to the client. Without any changes to the current locking structure, I was thinking of the following for batching multiple edits:
> a) A facility in the RPC Server to postpone responding to a particular call (communicating through ThreadLocals, maybe). This is not strictly required, but without it the number of operations batched would be limited to the number of IPC threads.
> b) Another Server thread that waits for pending commits to be synced and replies back to clients.
> c) An fsync manager that periodically syncs the edits log and informs waiting RPCs. The sync thread can dynamically decide to wait longer or shorter based on the load so that we don't increase the latency when the namenode is lightly loaded.
> Even a simple policy of 'sync if there are any mutations' will work, but that might reduce the hard disk life.
>
> All the synchronization between these threads is a bit complicated, but it can be stable. My main concern is whether the guarantee we are providing is enough for namenode operation. I think it is enough.
> In terms of throughput, the number of creates a namenode can do should be in the same range as the number of opens it can do.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
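
For readers following the quoted proposal, the batching idea can be illustrated with a minimal Java sketch. It is not the attached patch and not Hadoop's API: the class BatchedEditLog and the methods logEdit, logSync, flushToDisk and syncCount are hypothetical names chosen for illustration. It is also a simplified variant of the scheme in (a) to (c): instead of a dedicated fsync manager thread, the first caller that needs a sync performs it on behalf of every edit buffered so far, and later callers whose edits are already covered return without touching the disk. The guarantee discussed in the description (an edit is synced before the client gets its reply) is kept by having each caller wait in logSync.

```java
// Hypothetical sketch of batched (group) commits; names are illustrative only.
class BatchedEditLog {
    private long lastWrittenTxId = 0;       // highest edit id buffered in memory
    private long lastSyncedTxId = 0;        // highest edit id known to be durable
    private boolean syncInProgress = false; // true while some thread is flushing
    private int syncCount = 0;              // number of completed flushes

    /** Buffers one edit and returns its id. Cheap: no disk I/O happens here. */
    synchronized long logEdit(String op) {
        // A real log would append `op` to an in-memory buffer at this point.
        return ++lastWrittenTxId;
    }

    /** Returns once the edit with the given id is durable. */
    void logSync(long txId) {
        long syncUpTo;
        synchronized (this) {
            // If another thread is flushing, wait: its flush may cover this edit.
            while (txId > lastSyncedTxId && syncInProgress) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for sync", e);
                }
            }
            if (txId <= lastSyncedTxId) {
                return; // already covered by an earlier flush: the batching win
            }
            syncInProgress = true;
            syncUpTo = lastWrittenTxId; // this flush covers every edit buffered so far
        }
        boolean ok = false;
        try {
            flushToDisk(); // slow part, deliberately done without holding the lock
            ok = true;
        } finally {
            synchronized (this) {
                if (ok) {
                    lastSyncedTxId = syncUpTo;
                    syncCount++;
                }
                syncInProgress = false;
                notifyAll(); // wake threads waiting for this flush to finish
            }
        }
    }

    /**
     * Placeholder for the expensive step. A real implementation would write the
     * buffered edits and force them to disk (for example FileChannel.force(true))
     * once per image directory, and would need double buffering so that edits
     * logged during the flush are not counted as synced.
     */
    void flushToDisk() {
    }

    synchronized int syncCount() {
        return syncCount;
    }
}
```

With three edits buffered, a single logSync call on the last id makes all three durable, and logSync calls for the two earlier ids then return immediately. One way to use such a class, assuming the handler can release the namesystem lock before syncing, is to call logEdit while holding the lock and logSync after releasing it, so that open() and other non-mutating calls are not blocked behind the fsync.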