To: "Eric S. Raymond"
Cc: dev@subversion.apache.org
From: Julian Foad
Subject: Subversion semantics: no no-op changes
Message-ID: <0eb878c1-5c06-ce12-1a4d-ae11ebb94071@apache.org>
Date: Fri, 11 Oct 2019 15:56:31 +0100

Hello Eric.

TL;DR: I explain why I am convinced no-op changes don't belong in the Subversion versioning semantics.

With your work on Subversion repository and dump stream semantics, is this something you can offer a view on? I have previously failed to convince the developer community [1].

In examining the Subversion versioned data semantics and how the protocols and APIs represent them, I have come across a number of kinds of what could be called a "no-op change" or perhaps better described as "I touched this but did not change its value".

Example:

  - I changed the text of file F from T1 to T1;
  - now "svn log -v" tells me the text of F was "modified";
  - some variants of "svn diff" show no output;
  - some variants of "svn diff" show a diff header with no body.

That is the best known user-visible example. Other kinds are possible too, and a number of examples exist on the server side, e.g. [#4623].

A Subversion client generally does not send no-op changes to a repository, but in certain cases it does. A Subversion repository generally does not record and play back any no-op changes that may be sent to it, but in certain cases it does.

I am convinced "no-op changes" should be considered meaningless and removed from the data model presented to the user. In protocols and APIs, a no-op change should be considered a non-canonical form and a transient implementation detail of that particular interface, and implementations should not attempt to preserve it.

In the rest of this note I try to explain some angles to the issue.

The Subversion system is built on a main design principle of tree snapshots and differencing and merging of trees. A no-op change is out-of-tree metadata about certain pairs of trees. Carrying such metadata around the system in general is fundamentally incompatible with that principle.

One practical reason the existing system does not preserve that metadata is because, with very few exceptions, the existing interfaces convey no-op changes only implicitly, as a side effect of how their explicit operations are formulated, and so one differs from another.

For example, an interface that represents a file change through multiple optional operations, one of which is "on file F, property P changes to value V", can convey "property P1's value no-op-changed from V1 to V1, while property P2's value was not touched" if we invoke the "change property" operation for property P1 but do not invoke it for P2. On the other hand, an interface that represents a file change as a single operation, "new file := {text, {properties}}", cannot; the only no-op change it can convey is at a coarser granularity, "file F no-op-changed its value from {T1,PROPS1} to {T1,PROPS1}".

The kinds of no-op change an interface can convey locally are an implementation detail of that particular interface, and so cannot be expected to match those of any other interface unless that is explicitly required and tested, which it mostly is not.

Because the existing interfaces convey no-op-change information only incidentally, the system cannot be expected to preserve any particular no-op change when data flows through multiple interfaces, through commit, checkout, branch, diff, merge, and so on. Subversion preserves some of them only within very limited scopes (such as the "file changed" flag in the "changed paths" list in "svn log -v").

Some of the existing svn protocols and APIs explicitly preserve certain no-op changes. For example, one user reported [2] that in their svn history (converted from CVS) they would "hate to lose" the historical record that "svn log -v" reports "file text changed" for a certain no-op file change. When I eliminated this no-op change from "dump", without due care to backward compatibility, it was considered a regression and reverted [#4598].

There are valid arguments for preserving backward compatibility in some places. However, I propose such behaviour should be considered obsolete and broken, and a migration path should be planned to get away from it.

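To make the contrast between the two interface shapes concrete, here is a minimal sketch in Python. It is only an illustration of the argument: every name in it (apply_prop_ops, whole_file_untouched, canonicalize, the "set-prop" tuple) is hypothetical and none of it is a Subversion API or wire format.

```python
# Hypothetical model, NOT Subversion code: two ways to express a change to one file.

def apply_prop_ops(old_props, ops):
    """Interface A: a list of ("set-prop", name, value) operations.

    Returns (new_props, no_op_props). A property lands in no_op_props when
    an operation names it but sets the value it already had, which is the
    fine-grained "touched but not changed" information.
    """
    new_props = dict(old_props)
    no_op_props = set()
    for kind, name, value in ops:
        if kind != "set-prop":
            raise ValueError("unknown operation: %r" % (kind,))
        if old_props.get(name) == value:
            no_op_props.add(name)
        new_props[name] = value
    return new_props, no_op_props


def whole_file_untouched(old_file, new_file):
    """Interface B: a single "new file := (text, props)" operation.

    All it can reveal is that the file as a whole was sent with an
    unchanged value; it cannot say which property was "touched".
    """
    return new_file == old_file


def canonicalize(old_props, ops):
    """Drop every operation that does not change the property's value."""
    return [(kind, name, value) for (kind, name, value) in ops
            if old_props.get(name) != value]


old_props = {"P1": "V1", "P2": "V2"}
ops = [("set-prop", "P1", "V1")]          # touch P1, leave P2 alone

new_props, no_op_props = apply_prop_ops(old_props, ops)
same_file = whole_file_untouched(("T1", old_props), ("T1", new_props))
canonical_ops = canonicalize(old_props, ops)
```

Through interface A the "P1 was touched" fact is visible (no_op_props contains only P1); once the same change passes through interface B, only the coarser "file sent unchanged" fact remains, so the finer distinction is lost. The canonicalize step shows the proposed alternative: treat the no-op operation as a non-canonical form and drop it, leaving an empty operation list here.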
The snapshots argument is diluted because we already have at least one other kind of metadata outside a pure tree-snapshots system: the "copy-from" links. I am not immediately planning to ditch copy-from links, though I think there are good reasons, analogous to the reasoning about no-op changes, to replace them in a possible future system. I have given some thought to it. That would be a more visible change to the system, of course, though not so much as it might first appear.

The example of a no-op file text change is a simple one. An example with deeper implications is a directory copy combined with replacing one of its implicitly copied children with an explicit copy of that child from the same source from which it was implicitly copied. Addressing a case like this may be as simple as declaring one version as the canonical form, or may require further travel down the road of copy-from semantics.

In conclusion, I consider that svn would be a better system -- more predictable, testable, composable, etc.; more generally dependable -- and would lose no significant value at all -- if we were to explicitly remove no-op changes.

Does this all ring true and obvious to you, or can you explain better what I am getting at and what I'm missing?

- Julian

[1] Email: "No no-op changes", from me to dev@, 2014-09-19,
    https://svn.haxx.se/dev/archive-2014-09/0082.shtml
    https://mail-archives.apache.org/mod_mbox/subversion-dev/201409.mbox/%3c1411138196.98623.YahooMailNeo@web87703.mail.ir2.yahoo.com%3e

[2] Email: "No-op changes no longer dumped by 'svnadmin dump' in 1.9", from Johan Corveleyn to dev@, 2015-09-21,
    https://svn.haxx.se/dev/archive-2015-09/0269.shtml
    http://mail-archives.apache.org/mod_mbox/subversion-dev/201509.mbox/%3CCAB84uBVe8QnEpbPVAb__yQjiDDoYjFn2+M9mPcdBXZCwMCpOLw@mail.gmail.com%3E

[#4598] "No-op changes no longer dumped by 'svnadmin dump' in 1.9",
    https://subversion.apache.org/issue/4598
    https://issues.apache.org/jira/browse/SVN-4598

[#4623] "no-op prop change not preserved across dump/load"
    https://subversion.apache.org/issue/4623
    https://issues.apache.org/jira/browse/SVN-4623