subversion-dev mailing list archives

From Branko Čibej
Subject Re: Symmetry between dump and load
Date Fri, 19 Dec 2014 13:43:19 GMT
On 19.12.2014 13:23, Julian Foad wrote:
> I believe the following symmetries should be true, and testable, and we should test them.
> For any valid repository:
>   * we can dump it
>   * we can load the dump file into a new repository
>   * the new repo is equivalent to the old repo
> For any valid dump file:
>   * we can load it into a new repository
>   * we can dump that repository
>   * the new dump file is equivalent to the old dump file

I agree that this should be our goal. However, consider that some of
these symmetries depend on specific features of the repository.

For example, at some point you mentioned dump files with non-UTF-8
paths. Such dump files are clearly invalid, since we've maintained the
restriction that all strings used internally must be encoded in UTF-8.
So, such a dump file can only be the result of manual fiddling, or a bug
in some version of some repository back-end implementation. A different
and/or fixed backend will not accept non-UTF-8 paths at all; thus, we
cannot maintain this particular symmetry.
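
To make the restriction concrete, here is a minimal Python sketch of the
kind of check a strict loader might apply to raw path bytes read from a
dump file. The function names are hypothetical and are not Subversion
APIs; the only assumption is that a path counts as valid if and only if
it decodes as strict UTF-8.

```python
def is_valid_utf8_path(raw: bytes) -> bool:
    """Return True if the raw path bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def partition_paths(raw_paths):
    """Split raw paths into (accepted, rejected) lists.

    Hypothetical helper: a strict back-end would refuse the rejected
    paths, so a dump file containing them cannot round-trip.
    """
    accepted, rejected = [], []
    for raw in raw_paths:
        if is_valid_utf8_path(raw):
            accepted.append(raw)
        else:
            rejected.append(raw)
    return accepted, rejected
```

A path such as b"trunk/caf\xe9.txt" (Latin-1 bytes) lands in the
rejected list, which is exactly the case where load cannot reproduce
what an older dump emitted.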

Conversely, if we decide that maintaining strict dump/load symmetry is
more important, then we are complicating future development,
unnecessarily IMO (e.g., the idea that repos path lookup should preserve
but ignore differences in Unicode character representation).
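
A toy illustration of "preserve but ignore", in Python. This is only a
sketch under assumptions of mine: the class and method names are
hypothetical, and the choice of NFC as the comparison form is an
assumption, not a decision anyone has made for Subversion.

```python
import unicodedata


class PathTable:
    """Toy path table: stored names are kept byte-for-byte as given,
    but lookups ignore Unicode normalization differences."""

    def __init__(self):
        # Maps the NFC-normalized path to (original path, value).
        self._entries = {}

    @staticmethod
    def _key(path: str) -> str:
        # Assumption: NFC is used as the comparison form.
        return unicodedata.normalize("NFC", path)

    def add(self, path: str, value):
        key = self._key(path)
        if key in self._entries:
            raise KeyError("path collides with an existing entry: %r" % path)
        self._entries[key] = (path, value)

    def lookup(self, path: str):
        """Return (original path, value) for any equivalent spelling."""
        return self._entries[self._key(path)]
```

Here, adding "cafe\u0301" (decomposed) and looking up "caf\u00e9"
(composed) finds the same entry and returns the decomposed spelling that
was stored. A back-end working this way would refuse a dump file that
contains both spellings as distinct paths, so strict symmetry is lost
for such files.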

I'm sure there are other cases where maintaining strict symmetry will
turn out to be too constraining. An example from your own bailiwick:
when we store mergeinfo in a more reasonable structure than a versioned
property, a load from an older dumpfile will most likely lose details
of exactly how the mergeinfo was represented, so a later dump may
produce svn:mergeinfo values that are different from, but semantically
equivalent to, the original.
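
To show what "different but semantically equivalent" could mean in a
test, here is a small Python sketch that compares two svn:mergeinfo-style
values by the revisions they denote and not by their text. It assumes a
simplified model of the format: one "path:revlist" entry per line, a
comma-separated list of single revisions or inclusive "N-M" ranges, and
an optional trailing "*" marking a non-inheritable range. The function
names are hypothetical.

```python
def parse_mergeinfo(value: str):
    """Parse a mergeinfo-style value into {path: {(revision, inheritable)}}."""
    result = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon, so a colon inside the path is tolerated.
        path, _, revlist = line.rpartition(":")
        revs = result.setdefault(path, set())
        for item in revlist.split(","):
            item = item.strip()
            inheritable = not item.endswith("*")
            item = item.rstrip("*")
            first, _, last = item.partition("-")
            start = int(first)
            end = int(last) if last else start
            for rev in range(start, end + 1):
                revs.add((rev, inheritable))
    return result


def mergeinfo_equivalent(a: str, b: str) -> bool:
    """True if both values denote the same revisions for the same paths."""
    return parse_mergeinfo(a) == parse_mergeinfo(b)
```

Under this model "/trunk:1-3,5" and "/trunk:1,2-3,5" compare equal even
though the strings differ, which is the kind of equivalence a relaxed
dump/load test would have to accept.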

Clearly, dump/load symmetry can be preserved even in the cases I
mentioned, at the cost of maintaining more complex metadata (and related
code) in the repository back-end. The question we have to answer is:
what's the point, as long as semantics are not affected?

-- Brane
