Subject: svn commit: r1001677 - /subversion/trunk/notes/wc-ng/nodes
Date: Mon, 27 Sep 2010 11:40:18 -0000
To: commits@subversion.apache.org
From: ehu@apache.org
Message-Id: <20100927114018.7C6D42388A1C@eris.apache.org>

Author: ehu
Date: Mon Sep 27 11:40:18 2010
New Revision: 1001677

URL: http://svn.apache.org/viewvc?rev=1001677&view=rev
Log:
Add NODES design considerations document in nodes/wc-ng/nodes.

Added:
    subversion/trunk/notes/wc-ng/nodes

Added: subversion/trunk/notes/wc-ng/nodes
URL: http://svn.apache.org/viewvc/subversion/trunk/notes/wc-ng/nodes?rev=1001677&view=auto
==============================================================================
--- subversion/trunk/notes/wc-ng/nodes (added)
+++ subversion/trunk/notes/wc-ng/nodes Mon Sep 27 11:40:18 2010
@@ -0,0 +1,159 @@
+
+Description of the NODES table
+==============================
+
+
+ * Introduction
+ * Inclusion of BASE nodes
+ * Rows to store state
+ * Ordering rows into layers
+ * Visibility of multiple op_depth rows
+ * Restructuring the tree means adding rows
+ *
+
+
+Introduction
+------------
+
+The entire original design of wc-ng revolves around the notion that
+there are a number of states in a working copy, each of which needs
+to be managed. All operations - excluding merge - operate on three
+trees: BASE, WORKING and ACTUAL.
+
+For an in-depth description of what each means, the reader is referred
+to other documentation, also in the notes/ directory. In short, BASE
+is what was checked out from the repository; WORKING includes
+modifications made with Subversion commands while ACTUAL also includes
+changes which have been made with non-Subversion aware tools (rm, cp, etc.).
+
+The idea that there are three trees works - mostly. There is no need
+for more trees outside the area of the metadata administration and even
+there three trees got us pretty far. The problem starts when one realizes
+tree modifications can be overlapping or layered. Imagine a tree with
+a replaced subtree. It's possible to replace a subtree within the
+replacement. Imagine that happened and that the user wants to revert
+one of the replacements. Given a 'flat' system, with just enough columns
+in the database to record the 'old' and 'new' information per node, a single
+revert can be supported.
+However, in the example with the double
+replacement above, that would mean it's impossible to revert one of the
+two replacements: either there's not enough information in the deepest
+replacement to execute the highest-level replacement or vice versa
+- depending on which information was selected to be stored in the "new"
+columns.
+
+The NODES table is the answer to this problem: instead of having a single
+row in a table with WORKING nodes with just enough columns to record
+(as per the example) a replacement, the solution is to record different
+states by having multiple rows.
+
+
+
+Inclusion of BASE nodes
+-----------------------
+
+The original technical design of wc-ng included a WORKING_NODE and a
+BASE_NODE table. As described in the introduction, the WORKING_NODE
+table was replaced with NODES. However, the BASE_NODE table stores
+roughly the same state information that WORKING_NODE did. Additionally,
+in a number of situations, the system isn't interested in the type of
+state it gets returned (BASE or WORKING) - it just wants the latest.
+
+As a result the BASE_NODE table has been integrated into the NODES
+table.
+
+The main difference between the WORKING_NODE and BASE_NODE tables was
+that the BASE_NODE table contained a few caching fields which are
+not relevant to WORKING_NODE. Moving those to a separate table was
+determined to be wasteful because the primary key of that table
+would be much larger than any information stored in it in the first
+place.
+
+
+
+Rows to store state
+-------------------
+
+Rows of the NODES table store state of nodes in the BASE tree
+and the layers in the WORKING tree. Note that these nodes do not
+need to exist in the working copy presented to the user: they may
+be 'absent', 'not-present' or just removed (rm) without using
+Subversion commands.
+
+A row contains information linking to the repository, if the node
+was received from a repository.
+This reference may be a link to
+the original nodes for copied or moved nodes, but for rows designating
+BASE state, it refers to the repository location that was checked
+out.
+
+Additionally, the rows contain information about local modifications
+such as copy, move or delete operations.
+
+
+
+Ordering rows into layers
+-------------------------
+
+Since the table might contain more than one row per (wc_id, local_relpath)
+combination, an ordering mechanism needs to be added. To that effect
+the 'op_depth' value has been devised. The op_depth is an integer
+indicating the depth of the operation which modified the tree in order
+for the node to enter the state indicated in the row.
+
+Every row for a given (wc_id, local_relpath) combination must have a unique
+op_depth associated with it. The value of op_depth is related to the
+top-most node being modified in the given tree-restructuring
+operation (operation root or oproot). E.g. upon deletion of a subtree,
+all nodes in the subtree will have rows in the table with the same
+op_depth.
+
+The op_depth is calculated by taking the number of path components in
+the local_relpath of the oproot. The unmodified tree (BASE) is identified
+by rows with an op_depth value 0.
+
+When multiple restructuring operations affect the same path in a modified
+subtree (most notably replacements), the table may end up with multiple rows
+for that path with an op_depth greater than 0.
+
+
+
+Visibility of multiple op_depth rows
+------------------------------------
+
+As stated in the introduction, there's no need to leak the concept of
+multiple op_depth rows out of the metadata store - apart from the BASE
+and WORKING trees.
+
+As described before, the BASE tree is defined by op_depth == 0. WORKING as
+visible outside the metadata store maps back to those rows where
+op_depth == MAX(op_depth) for each (wc_id, local_relpath) combination.
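The op_depth calculation and the BASE / WORKING selection rules above can
be sketched with an in-memory SQLite table. This is only an illustration
under stated assumptions: the table is reduced to the columns named in this
note, the 'state' column and its values are hypothetical stand-ins, and
op_depth_for() is a made-up helper, not a Subversion function.

```python
import sqlite3


def op_depth_for(oproot_relpath):
    """Count the path components in the oproot's local_relpath."""
    return len([c for c in oproot_relpath.split("/") if c])


conn = sqlite3.connect(":memory:")
# Hypothetical, reduced layout: one row per (wc_id, local_relpath, op_depth).
conn.execute(
    "CREATE TABLE nodes ("
    " wc_id INTEGER NOT NULL,"
    " local_relpath TEXT NOT NULL,"
    " op_depth INTEGER NOT NULL,"
    " state TEXT NOT NULL,"  # illustrative column, not from the real schema
    " PRIMARY KEY (wc_id, local_relpath, op_depth))"
)

rows = [
    # BASE layer: op_depth 0.
    ("A", 0, "base"), ("A/B", 0, "base"), ("A/B/f", 0, "base"),
    # Replacement rooted at A/B: every node in the subtree gets the
    # op_depth of the oproot.
    ("A/B", op_depth_for("A/B"), "replaced"),
    ("A/B/f", op_depth_for("A/B"), "replaced"),
    # Second replacement inside the first one, rooted at A/B/f.
    ("A/B/f", op_depth_for("A/B/f"), "replaced"),
]
conn.executemany("INSERT INTO nodes VALUES (1, ?, ?, ?)", rows)

# BASE tree: rows with op_depth == 0.
base = conn.execute(
    "SELECT local_relpath FROM nodes"
    " WHERE wc_id = 1 AND op_depth = 0 ORDER BY local_relpath"
).fetchall()

# WORKING as seen from outside: the row with MAX(op_depth) per path.
working = conn.execute(
    "SELECT n.local_relpath, n.op_depth FROM nodes n"
    " WHERE n.wc_id = 1 AND n.op_depth ="
    "  (SELECT MAX(m.op_depth) FROM nodes m"
    "   WHERE m.wc_id = n.wc_id AND m.local_relpath = n.local_relpath)"
    " ORDER BY n.local_relpath"
).fetchall()
```

In this sketch A/B/f has three rows (op_depth 0, 2 and 3); only the
op_depth 3 row is visible as WORKING, while the op_depth 0 row remains
reachable as BASE.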
+
+
+
+Restructuring the tree means adding rows
+----------------------------------------
+
+The basic idea behind the NODES table is that every tree restructuring
+operation causes rows to be added to the table in order to best support
+the reversal process: in that case a revert simply means deletion of rows
+and bringing the subtree back into sync with the metadata.
+
+There's one exception: When a delete is followed by a copy or move to
+the deleted location - causing a replacement - a pre-existing (due to the
+delete) row with the right op_depth exists and needs to be modified. On
+revert, the modified nodes need to be restored to 'deleted' state, which
+itself can be reverted during the next revert.
+
+### EHU: The statement above probably means that *all* nodes in the subtree
+         need to be rewritten: they all have a deleted state with the affected
+         op_depth, meaning they probably need a 'replaced/copied-to' state with
+         the same op_depth...
+
+
+
+
+
+
+TODO:
+ * Explain the role of the 'deleted-below' columns
+ * Document states of the table and their meaning (including values
+   of the relevant columns)
\ No newline at end of file
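
The simple case from "Restructuring the tree means adding rows", where a
revert just deletes the rows added by one operation, might look roughly like
the sketch below. It does not cover the delete-then-copy exception. The
reduced table layout, the 'state' column and the revert_layer() and
working_rows() helpers are all hypothetical and exist only for this
illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced layout; 'state' is an illustrative column.
conn.execute(
    "CREATE TABLE nodes ("
    " wc_id INTEGER NOT NULL,"
    " local_relpath TEXT NOT NULL,"
    " op_depth INTEGER NOT NULL,"
    " state TEXT NOT NULL,"
    " PRIMARY KEY (wc_id, local_relpath, op_depth))"
)
conn.executemany("INSERT INTO nodes VALUES (1, ?, ?, ?)", [
    ("A", 0, "base"), ("A/B", 0, "base"), ("A/B/f", 0, "base"),
    ("A/B", 2, "replaced"), ("A/B/f", 2, "replaced"),  # oproot A/B
    ("A/B/f", 3, "replaced"),                          # oproot A/B/f
])


def revert_layer(conn, wc_id, oproot_relpath):
    """Delete the rows that the operation rooted at oproot_relpath added."""
    depth = len([c for c in oproot_relpath.split("/") if c])
    prefix = oproot_relpath + "/"
    conn.execute(
        "DELETE FROM nodes WHERE wc_id = ? AND op_depth = ?"
        " AND (local_relpath = ? OR substr(local_relpath, 1, ?) = ?)",
        (wc_id, depth, oproot_relpath, len(prefix), prefix),
    )


def working_rows(conn, wc_id):
    """Return (local_relpath, op_depth) of the top-most row per path."""
    return conn.execute(
        "SELECT local_relpath, MAX(op_depth) FROM nodes"
        " WHERE wc_id = ? GROUP BY local_relpath ORDER BY local_relpath",
        (wc_id,),
    ).fetchall()


# Revert the inner replacement first, then the outer one.
revert_layer(conn, 1, "A/B/f")
after_inner = working_rows(conn, 1)
revert_layer(conn, 1, "A/B")
after_outer = working_rows(conn, 1)
```

Because each replacement lives in its own layer of rows, either one can be
undone without losing the information needed to undo the other, which is
the limitation of the 'flat' design described in the introduction.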