Return-Path: X-Original-To: apmail-subversion-commits-archive@minotaur.apache.org Delivered-To: apmail-subversion-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id DE51E11666 for ; Thu, 26 Jun 2014 15:09:05 +0000 (UTC) Received: (qmail 91008 invoked by uid 500); 26 Jun 2014 15:09:05 -0000 Delivered-To: apmail-subversion-commits-archive@subversion.apache.org Received: (qmail 90981 invoked by uid 500); 26 Jun 2014 15:09:05 -0000 Mailing-List: contact commits-help@subversion.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@subversion.apache.org Delivered-To: mailing list commits@subversion.apache.org Received: (qmail 90971 invoked by uid 99); 26 Jun 2014 15:09:05 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 26 Jun 2014 15:09:05 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of stefan.fuhrmann@wandisco.com designates 209.85.213.180 as permitted sender) Received: from [209.85.213.180] (HELO mail-ig0-f180.google.com) (209.85.213.180) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 26 Jun 2014 15:08:59 +0000 Received: by mail-ig0-f180.google.com with SMTP id h18so823283igc.1 for ; Thu, 26 Jun 2014 08:08:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wandisco.com; s=gapps; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1vlsvU/nr3H7pMgI1KWQQodQHD/MH/bn25xKW/qBIXo=; b=nE9uWB3Wu8/9rZPMjpc3gRUIlx8mIWNtAr2X5ltk8XkyYRIIOaZREmi1XY9chcVSPQ ztM55ipjEruwJ7QOwCCbSvbRyxLOMbdoZJMDMPAhnfViH4O0kL6rXDTKhAL2x8z8yD03 7JWc8HboxTjT/6ZVDkp4VVcI6b+IsW5VmovvY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=1vlsvU/nr3H7pMgI1KWQQodQHD/MH/bn25xKW/qBIXo=; b=EzPJ+TMcjik+96QaIrFd5gSzFcMnYGUq0exL1WvlTAufX1Oku5EcdmmgWocbmzFXD/ YOZOvTtT48pSQjiMKgbhQj3AZNYYBVAHlV93xZH+WhTcdYpmPFQbEW2K+748EWx54hwj Yo57l15IhT5lTCNyXw/msUYB/w5neaUpnOdd+/X65yvbm8CNqPRUkE1D7WiQIFYyIfgc FQ7NhUsqZY0EDQJivq+PXS1PaobtSaiEJq1Mf0Kg5ocrncf7y0xfVmad5EaKBvGqPmLg qQ3uKbMWySxJRUwvGnDFrTxOfRS5cHraBXOS786pfY/bgIi3rF6ifp4oQZCNfGgCt0iy p5vw== X-Gm-Message-State: ALoCoQkKG0YrSmAlrHwX4ltKLold6IVlkChmS1s32sLq2Bd/6Y2XeSCO37EqQUZNiymWTZGIrV3Q MIME-Version: 1.0 X-Received: by 10.43.154.207 with SMTP id lf15mr2869404icc.92.1403795318415; Thu, 26 Jun 2014 08:08:38 -0700 (PDT) Received: by 10.50.231.131 with HTTP; Thu, 26 Jun 2014 08:08:38 -0700 (PDT) In-Reply-To: References: <20140621151532.86D5C2388980@eris.apache.org> <20140625181139.GE1843@tarsus.local2> Date: Thu, 26 Jun 2014 17:08:38 +0200 Message-ID: Subject: Re: svn commit: r1604421 - in /subversion/trunk: subversion/libsvn_fs_fs/ subversion/tests/cmdline/ subversion/tests/libsvn_fs_fs/ tools/server-side/svnfsfs/ From: Stefan Fuhrmann To: Ivan Zhakov Cc: Daniel Shahaf , Subversion Development , "commits@subversion.apache.org" Content-Type: multipart/alternative; boundary=001a11c2de944ac15f04fcbe910d X-Virus-Checked: Checked by ClamAV on apache.org --001a11c2de944ac15f04fcbe910d Content-Type: text/plain; charset=UTF-8 On Thu, Jun 26, 2014 at 11:52 AM, Ivan Zhakov wrote: > On 25 June 2014 22:11, Daniel Shahaf wrote: > > Ivan Zhakov wrote on Wed, Jun 25, 2014 at 20:03:11 +0400: > >> On 25 June 2014 19:54, Stefan Fuhrmann > wrote: > >> > On Wed, Jun 25, 2014 at 5:46 PM, Ivan Zhakov > wrote: > >> >> > >> >> On 25 June 2014 19:24, Stefan Fuhrmann > > >> >> wrote: > >> >> > On Wed, Jun 25, 2014 at 4:57 PM, Ivan Zhakov > wrote: > >> >> >> > >> >> >> On 21 June 2014 17:15, wrote: > >> >> >> > Author: stefan2 > >> >> >> > Date: Sat Jun 21 15:15:30 2014 > >> >> >> > New Revision: 1604421 > >> >> >> > > >> >> >> > URL: http://svn.apache.org/r1604421 > >> >> >> > Log: > >> >> >> > Append index data to the rev data file instead of using separate > >> >> >> > files. > > >> >> >> > > > >============================================================================== > >> >> >> >--- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original) > >> >> >> >+++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Sat Jun 21 > >> >> >> > 15:15:30 > >> >> >> > 2014 > >> >> >> >@@ -1449,15 +1449,35 @@ write_revision_zero(svn_fs_t *fs) > >> >> >> >+ SVN_ERR(svn_io_file_create_binary(path_revision_zero, > >> >> >> >+ "PLAIN\nEND\nENDREP\n" > >> >> >> >+ "id: 0.0.r0/2\n" > >> >> >> >+ "type: dir\n" > >> >> >> >+ "count: 0\n" > >> >> >> >+ "text: 0 3 4 4 " > >> >> >> >+ "2d2977d1c96f487abe4a1e202dd03b4e\n" > >> >> >> >+ "cpath: /\n" > >> >> >> >+ "\n\n" > >> >> >> >+ > >> >> >> >+ /* L2P index */ > >> >> >> >+ "\0\x80\x40" /* rev 0, 8k entries > per page > >> >> >> > */ > >> >> >> >+ "\1\1\1" /* 1 rev, 1 page, 1 > page in > >> >> >> > 1st > >> >> >> > rev */ > >> >> >> >+ "\6\4" /* page size: bytes, > count */ > >> >> >> >+ "\0\xd6\1\xb1\1\x21" /* phys offsets + 1 */ > >> >> >> >+ > >> >> >> >+ /* P2L index */ > >> >> >> >+ "\0\x6b" /* start rev, rev file > size > >> >> >> > */ > >> >> >> >+ "\x80\x80\4\1\x1D" /* 64k pages, 1 page > using 29 > >> >> >> > bytes */ > >> >> >> >+ "\0" /* offset entry 0 page > 1 */ > >> >> >> >+ /* len, item & type, > rev, > >> >> >> > checksum */ > >> >> >> >+ "\x11\x34\0\xf5\xd6\x8c\x81\x06" > >> >> >> >+ "\x59\x09\0\xc8\xfc\xf6\x81\x04" > >> >> >> >+ "\1\x0d\0\x9d\x9e\xa9\x94\x0f" > >> >> >> >+ "\x95\xff\3\x1b\0\0" /* last entry fills up > 64k > >> >> >> > page > >> >> >> > */ > >> >> >> >+ > >> >> >> >+ /* Footer */ > >> >> >> >+ "107 121\7", > >> >> >> >+ 107 + 14 + 38 + 7 + 1, fs->pool)); > >> >> >> This constant makes code unreadable, unmaintainable and very error > >> >> >> prone. > > ... > >> I saw it, but it doesn't make code easier to maintain. > > > > Ivan, that's the third time in as many emails that you say "the new code > > is hard to maintain". Could you please explain *how* you find it hard > > to maintain? > > > > Anyway, that's just me guessing. Could you please clarify what exactly > > makes the code unmaintainable in your opinion? > > > In my opinion 'code maintainability' is how easy or hard to make > change and review the code. This magic constant with a lot 7b encoded > numbers are very hard to review and modify if needed in future. Even > Stefan2 mentioned that he made slight mistakes when changing that > constant that was caught by test suite. But I don't assume that code > that passes all test suite doesn't have bugs, while Stefan2 seems to > assume. I bet that nobody reviewed "\x11\x34\0\xf5\xd6\x8c\x81\x06" > constant in the code above. Even I didn't, because I think it's a > waste of time and proper code should be committed or change reverted. > Hi Ivan, I see three alternative ways to code that function 1. As hard coded string / byte sequence (current implementation). Cons: * Hard to write, hard to review by just looking at it (applies to time until initial release only). Pros: * Explicitly coded as constant, deterring people from changing it. * Independent of other code, i.e. unintended changes to the format / encoding generated by the normal code usually become apparent when running the test suite. 2. Use our txn code to write r0. This should be simple and might at most require some special ID handling. Cons: * Generating incompatible r0s is likely not caught by our test suite (assuming that reader and writer functions are in sync). Basically all the risk that comes with dynamically generating a "constant". * Test cases must make sure our backward compat repo creation options create repos that can actually be used by old releases. (we might want explicit test for that anyway, though) Pros: * No or very little special code for r0 (not sure, yet). * Format / encoding changes don't require new r0 templates. 3. Write code to "stitch" r0 together, e.g. string_add(md5("END\n")). Cons: * No more robust than 1. but requires much more code. * May _look_ easy to understand but an actual offline review is still hard. Pros: * Widely independent of other code, although not as much as 1. Do you generally agree with the range of options and their assessment? Which one would you pick and why? I'd be fine with switching to option 2 as long as everyone understands the implications. -- Stefan^2. --001a11c2de944ac15f04fcbe910d Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


On Thu, Jun 26, 2014 at 11:52 AM, Ivan Zhakov <ivan@visualsvn.com= > wrote:
On 2= 5 June 2014 22:11, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> Ivan Zhakov wrote on Wed, Jun 25, 2014 at 20:03:11 +0400:
>> On 25 June 2014 19:54, Stefan Fuhrmann <stefan.fuhrmann@wandisco.com> wrote:
>> > On Wed, Jun 25, 2014 at 5:46 PM, Ivan Zhakov <ivan@visualsvn.com> wrote:
>> >>
>> >> On 25 June 2014 19:24, Stefan Fuhrmann <stefan.fuhrmann@wandisco.com>
>> >> wrote:
>> >> > On Wed, Jun 25, 2014 at 4:57 PM, Ivan Zhakov <ivan@visualsvn.com> wrote:
>> >> >>
>> >> >> On 21 June 2014 17:15, =C2=A0<stefan2@apache.org> wrote:
>> >> >> > Author: stefan2
>> >> >> > Date: Sat Jun 21 15:15:30 2014
>> >> >> > New Revision: 1604421
>> >> >> >
>> >> >> > URL: http://svn.apache.org/r1604421
>> >> >> > Log:
>> >> >> > Append index data to the rev data file inst= ead of using separate
>> >> >> > files.
=C2= =A0
>> >> >> > > >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>> >> >> >--- subversion/trunk/subversion/libsvn_fs_fs= /fs_fs.c (original)
>> >> >> >+++ subversion/trunk/subversion/libsvn_fs_fs= /fs_fs.c Sat Jun 21
>> >> >> > 15:15:30
>> >> >> > 2014
>> >> >> >@@ -1449,15 +1449,35 @@ write_revision_zero(= svn_fs_t *fs)
>> >> >> >+ =C2=A0 =C2=A0SVN_ERR(svn_io_file_create_bi= nary(path_revision_zero,
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"PLAIN\nEND\nENDREP\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"id: 0.0.r0/2\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"type: dir\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"count: 0\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"text: 0 3 4 4 "
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"2d2977d1c96f487abe4a1e202dd03b4e\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"cpath: /\n"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\n\n"
>> >> >> >+
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0/* L2P index */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\0\x80\x40" =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0/* rev 0, 8k entries per page
>> >> >> > */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\1\1\1" =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0/* 1 rev, 1 page, 1 page in
>> >> >> > 1st
>> >> >> > rev */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\6\4" =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0/* page size: bytes, count */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\0\xd6\1\xb1\1\x21" =C2=A0/* phys offsets + = 1 */
>> >> >> >+
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0/* P2L index */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\0\x6b" =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0/* start rev, rev file size
>> >> >> > */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\x80\x80\4\1\x1D" =C2=A0 =C2=A0/* 64k pages,= 1 page using 29
>> >> >> > bytes */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\0" =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0/* offset entry 0 page 1 */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0/* len, item & type, rev,
>> >> >> > checksum */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\x11\x34\0\xf5\xd6\x8c\x81\x06"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\x59\x09\0\xc8\xfc\xf6\x81\x04"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\1\x0d\0\x9d\x9e\xa9\x94\x0f"
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"\x95\xff\3\x1b\0\0" =C2=A0/* last entry fill= s up 64k
>> >> >> > page
>> >> >> > */
>> >> >> >+
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0/* Footer */
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0"107 121\7",
>> >> >> >+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0107 + 14 + 38 + 7 + 1, fs->pool));
>> >> >> This constant makes code unreadable, unmaintaina= ble and very error
>> >> >> prone.
> ...
>> I saw it, but it doesn't make code easier to maintain.
>
> Ivan, that's the third time in as many emails that you say "t= he new code
> is hard to maintain". =C2=A0Could you please explain *how* you fi= nd it hard
> to maintain?
>
> Anyway, that's just me guessing. =C2= =A0Could you please clarify what exactly
> makes the code unmaintainable in your opinion?
>
In my opinion 'code maintainability' is how easy or hard to m= ake
change and review the code. This magic constant with a lot 7b encoded
numbers are very hard to review and modify if needed in future. Even
Stefan2 mentioned that he made slight mistakes when changing that
constant that was caught by test suite. But I don't assume that code that passes all test suite doesn't have bugs, while Stefan2 seems to assume. I bet that nobody reviewed "\x11\x34\0\xf5\xd6\x8c\x81\x06&quo= t;
constant in the code above. Even I didn't, because I think it's a waste of time and proper code should be committed or change reverted.

Hi Ivan,

I see three alternative ways to code t= hat function

1. As hard coded strin= g / byte sequence (current implementation).
=C2=A0 Cons:
* Hard to write, hard to r= eview by just looking at it (applies to time
=C2=A0 until initial release only).
=C2=A0 Pros:
* Explicitly coded as constant, deterring people from changing it.
* Independent of other code, i.e. unintended ch= anges to the format /
=C2=A0 encoding g= enerated by the normal code usually become apparent
=C2=A0 when running the test suite.
2. Use our txn code to write r0. This sho= uld be simple and might
=C2=A0 at most = require some special ID handling.
=C2=A0 Cons:
* Generating incompatible r0s is likely not caught by our test suite=
=C2=A0 (assuming that reader and write= r functions are in sync). Basically
=C2=A0 all the risk that comes with dynami= cally generating a "constant".
* Test cases must make sure our backward compat repo creation
=C2=A0 options create repos that can actually be used by old releases.
<= /div>
=C2=A0 (we might want explicit test for tha= t anyway, though)
=C2=A0 Pros:
* No or very little special code for r0 (not sure, yet).
* Format / encoding changes don't require new r0 temp= lates.

3. Write code to "stitc= h" r0 together, e.g. string_add(md5("END\n")).
=C2=A0 Cons:
* No more robust than 1. but requires much more code.
* May _look_ easy to understand but an actual offline= review is still hard.
=C2=A0 Pros:
* Widely independent of other code, although not as much as 1.
Do you generally agree with the range of= options and their assessment?
Which one would you pick and why?

<= /div>
I'd be fine with switching to option 2 = as long as everyone understands
the implications.

-- Stefan^2.
--001a11c2de944ac15f04fcbe910d--