Reply-To: hadoop-dev@lucene.apache.org
From: "Dhruba Borthakur"
To:
Subject: RE: [jira] Commented: (HADOOP-334) Redesign the dfs namespace datastructures to be copy on write
Date: Mon, 6 Nov 2006 14:24:53 -0800
Message-ID:
<016f01c701f2$61fc0c10$639115ac@ds.corp.yahoo.com>
In-Reply-To: <28796411.1162849058917.JavaMail.jira@brutus>

Regarding the copy-on-write approach: we do not need to traverse the entire namespace to reset the clone pointers at the end of the checkpointing process. We can keep a lookaside list that contains all the nodes that have a clone pointer. But we still have to acquire the global lock at the end of the checkpointing process, traverse this lookaside list of cloned nodes, and null their clone pointers.

I like the generalized scheme of fine-grained locks (instead of a global lock) while traversing the namespace. It is more efficient once implemented correctly. Renames require quite a few lock-hierarchy tricks, but it can be done.

The one thing that I am not clear about is whether we get correct semantics if the image file and the edits file overlap. If x, y and z are three transactions, are you saying that x + y + z is equivalent to x + y + y + z, where y is a single transaction that resides in the image file as well as the edits file? Are you proposing something like a global transaction number to identify duplicate transactions?

-----Original Message-----
From: Sameer Paranjpye (JIRA) [mailto:jira@apache.org]
Sent: Monday, November 06, 2006 1:38 PM
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-334) Redesign the dfs namespace datastructures to be copy on write

[ http://issues.apache.org/jira/browse/HADOOP-334?page=comments#action_12447542 ]

Sameer Paranjpye commented on HADOOP-334:
-----------------------------------------

Copy on write helps, but the global lock needs to be acquired at the end of the checkpointing process nevertheless.
This still has the effect of locking clients out of the namespace while the entire namespace is traversed and the clone pointers are reset.

Instead of copy on write, how about changing the locking model so that:

1. For any change, acquire read locks on all structures between the root and the change, and acquire a write lock on the changed node.
2. To checkpoint, traverse the namespace, acquiring read locks on the path between the root and the node being checkpointed. Serialize each node to a new image file on disk.

This way we never lock down the whole tree for any operation.

At the start of the checkpointing process, a new edits file is created. Edits that occur while the checkpoint is in progress are sent to the new file. This implies that there will be some overlap between the checkpointed image and the edits file, but this is OK. We require that the union of the image and the edits give us the current state of the namespace, but the two do not have to be disjoint.

> Redesign the dfs namespace datastructures to be copy on write
> -------------------------------------------------------------
>
>                 Key: HADOOP-334
>                 URL: http://issues.apache.org/jira/browse/HADOOP-334
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.4.0
>            Reporter: Owen O'Malley
>         Assigned To: Konstantin Shvachko
>
> The namespace datastructures should be copy on write so that the namespace does not need to be completely locked down from user changes while the checkpoint is being made.
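
The x + y + y + z question about an overlapping image file and edits file can be made concrete with a toy sketch. It assumes, hypothetically, that every edit carries a global transaction number and that the image records the highest number already folded into it. Neither message confirms such a scheme; Edit, ToyNamespace and the "add block" operation are illustrative names, not Hadoop classes. "Add block" is chosen because applying it twice is visibly wrong.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical edit record: a global transaction number plus a toy "add block" operation. */
final class Edit {
    final long txId;
    final String path;
    final long blockId;

    Edit(long txId, String path, long blockId) {
        this.txId = txId;
        this.path = path;
        this.blockId = blockId;
    }
}

/** Toy namespace: file path -> block list, plus the highest transaction already applied. */
final class ToyNamespace {
    final Map<String, List<Long>> blocks = new HashMap<>();
    long lastAppliedTxId = 0;

    /** Applies an edit unless its transaction number shows it is already reflected. */
    boolean apply(Edit e) {
        if (e.txId <= lastAppliedTxId) {
            return false; // duplicate: the image already contains this transaction
        }
        blocks.computeIfAbsent(e.path, p -> new ArrayList<>()).add(e.blockId);
        lastAppliedTxId = e.txId;
        return true;
    }

    /** Replays an edits file in order and returns how many edits were actually applied. */
    int replay(List<Edit> edits) {
        int applied = 0;
        for (Edit e : edits) {
            if (apply(e)) {
                applied++;
            }
        }
        return applied;
    }
}
```

With x and y already in the image, replaying an edits file that holds y and z applies only z, so the file ends up with each block once. A single high-water mark is only safe if the image reflects a contiguous prefix of the transactions. A checkpoint taken node by node under fine-grained locks may not have that property, in which case something like a per-node transaction number would be needed instead.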