From: Philip Martin
To: Yves Martin
Cc: dev@subversion.apache.org
Subject: Re: Subversion FSFS logical addressing and packed shard
Date: Tue, 09 Feb 2016 00:45:52 +0000
Message-ID: <87io1yd83j.fsf@wandisco.com>
In-Reply-To: <1454879812.25042.20.camel@gmail.com>

Yves Martin writes:

> I would like to better understand how logical addressing has impact on
> packed shard. My objective is to provide a "unpack" feature with logical
> addressing in my version of "fsfs-reshard.py" script:
> https://github.com/ymartin59/svn-fsfs-reshard
>
> Do you have any hints to help me in that job or do you already know it
> is irrelevant and should be considered as a pure waste of time ?

Logical addressing refers to items in the revision file by an index
number rather than an offset. The revision file also contains an index
map that allows index numbers to be converted to offsets and offsets to
be converted to index numbers. The index map also contains the length
of each item and the revision number; the revision number is trivial
for an unpacked revision file. A pack file has a similar index map but
in this case the revision number varies.

The 1.9 tool svnfsfs can dump and load the index maps of revision and
pack files.
An example (shard size 4):

$ svnfsfs dump-index repo 1
       Start       Length Type  Revision     Item Checksum
           0           2a chgs         3        1 5f5b9c31
          2a           2a chgs         2        1 efee8d5b
          54           2a chgs         1        1 eee1b382
          7e            1 chgs         0        1 f28a4f1d
          7f           79 node         3        2 7e6fca28
          f8           72 drep         3        5 21933af7
         16a           55 drep         2        5 6f371fa3
         1bf           39 drep         1        5 8da855e0
         1f8           11 drep         0        3 60232b75
         209           9d node         1        4 d684e01d
         2a6           1b frep         1        3 1823e0a0
         2c1           9d node         2        4 3bd76335
         35e           1b frep         2        3 5b6fd650
         379           9d node         3        4 70fb00b0
         416           1b frep         3        3 1f9eb8e6
         431           78 node         2        2 7c048873
         4a9           78 node         1        2 cde8ee37
         521           59 node         0        2 403dbe48

Note that the items that make up a revision are not consecutive in the
pack file.

In principle the unpack is not hard. Read the index map from the pack
file. Then construct the revision files by extracting items from the
pack file and adding them to revision files, keeping track of the new
offsets. Do one revision file at a time or multiple revision files in
parallel. Once all the items are present in a revision file, construct
the new index map for that revision file.

It might be tricky to implement this in Python simply because you need
code to dump and load the index maps. You would have to write that code
from scratch, or run the svnfsfs tool, or write a Python binding to the
C code. An alternative would be to implement an unpack operation for
svnadmin in C and use the existing C code to handle the index maps.

-- 
Philip Martin
WANdisco
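
P.S. A rough sketch of the "read the index, then track the new offsets"
steps in Python, assuming you go the route of running svnfsfs and
parsing its text output. The names Entry, parse_dump_index and
plan_unpack are made up for illustration. It assumes the Start and
Length columns are hexadecimal (the sample offsets only add up that
way) and that Revision and Item are decimal. It also assumes items can
be laid out in the new revision files in pack order, which I have not
checked against what FSFS expects. It only plans the layout; it copies
no data and writes no index.

```python
from collections import defaultdict
from typing import NamedTuple

# Output of "svnfsfs dump-index repo 1" from the example (shard size 4).
SAMPLE = """\
Start Length Type Revision Item Checksum
0   2a chgs 3 1 5f5b9c31
2a  2a chgs 2 1 efee8d5b
54  2a chgs 1 1 eee1b382
7e  1  chgs 0 1 f28a4f1d
7f  79 node 3 2 7e6fca28
f8  72 drep 3 5 21933af7
16a 55 drep 2 5 6f371fa3
1bf 39 drep 1 5 8da855e0
1f8 11 drep 0 3 60232b75
209 9d node 1 4 d684e01d
2a6 1b frep 1 3 1823e0a0
2c1 9d node 2 4 3bd76335
35e 1b frep 2 3 5b6fd650
379 9d node 3 4 70fb00b0
416 1b frep 3 3 1f9eb8e6
431 78 node 2 2 7c048873
4a9 78 node 1 2 cde8ee37
521 59 node 0 2 403dbe48
"""


class Entry(NamedTuple):
    start: int     # byte offset of the item in the pack file
    length: int    # item length in bytes
    type: str      # "chgs", "node", "drep", "frep", ...
    revision: int  # revision the item belongs to
    item: int      # item index within that revision
    checksum: str


def parse_dump_index(text):
    """Parse dump-index text shaped like SAMPLE into Entry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "Start":
            continue  # skip the header row and anything unexpected
        start, length, type_, rev, item, checksum = fields
        # Assumption: Start and Length are hex, Revision and Item decimal.
        entries.append(Entry(int(start, 16), int(length, 16), type_,
                             int(rev), int(item), checksum))
    return entries


def plan_unpack(entries):
    """Group pack items by revision and assign each a new offset.

    Returns {revision: [(entry, new_offset), ...]}, where new_offset is
    the position the item would have in its own revision file if items
    are appended in pack order (an assumption, see above).
    """
    plan = defaultdict(list)
    next_offset = defaultdict(int)
    for entry in sorted(entries, key=lambda e: e.start):
        plan[entry.revision].append((entry, next_offset[entry.revision]))
        next_offset[entry.revision] += entry.length
    return dict(plan)


if __name__ == "__main__":
    for rev, items in sorted(plan_unpack(parse_dump_index(SAMPLE)).items()):
        for entry, new_offset in items:
            print(rev, entry.type, entry.item,
                  format(entry.start, "x"), "->", format(new_offset, "x"))
```

Each (entry, new_offset) pair says which bytes to copy out of the pack
file and where they land in the revision file. The per-revision index
maps would still have to be built from those new offsets, for example
by feeding them back through svnfsfs.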