subversion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Evgeny Kotkov <evgeny.kot...@visualsvn.com>
Subject Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9
Date Sun, 04 Oct 2015 14:28:36 GMT
Stefan Fuhrmann <stefan.fuhrmann@wandisco.com> writes:

> Right now, we are losing information. In rare cases and probably nothing too
> important but still. This aspect makes me consider the current behaviour a
> bug. Whether creating that situation in the first place was o.k. nor not is
> a separate issue.

To my mind, the current behavior is a regression, because:

- svnadmin dump produces different dumps for the same repository, depending
  on the version (1.8 or 1.9)
- svnadmin dump and svnrdump dump in 1.9 produce different dumps for the
  same repository
- After a dump-load sequence with svnadmin from 1.9, the resulting repository
  behaves differently from the original in operations like 'svn log'

I committed a failing test in r1706428, and I also think that we have quite
a problem to solve — see below.

[...]

> I suggest that we apply the patch below. It is more efficient that the
> original behaviour but ensures that "no-op" changes will be retained.

[...]

> V2 of the patch also covers no-op prop changes to directories.

Although the V2 patch seems to mitigate the problem in one particular case,
it does not solve it generally.  For instance, here is another scenario where
1.8 and 1.9 binaries produce different dumps, even with the patch applied:

  svnmucc cp 1 bar baz put empty-file baz

  Node-copyfrom-path: bar
  Text-copy-source-md5: d41d8cd98f00b204e9800998ecf8427e
  Text-copy-source-sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709
 -Text-content-length: 0
 -Text-content-md5: d41d8cd98f00b204e9800998ecf8427e
 -Text-content-sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709
 -Content-length: 0

I examined the history of fixing problems caused by r1572363 + r1573111, and
this is the second time when we find ourselves whack-a-moling the calling sites
of svn_fs_props_different() and svn_fs_contents_different().  Previously, we
had issues with misbehaving 'svn blame -g' [1].  This ended up with placing
a hack in subversion/libsvn_repos/rev_hunt.c, and with rather questionable
compatibility promises for svn_ra_get_file_revs2():

[[[
  else if (sb->include_merged_revisions
           && strcmp(sb->last_path, path_rev->path))
    {
      /* ### This is a HACK!!!
       * Blame -g, in older clients anyways, relies on getting a notification
       * whenever the path changes - even if there was no content change.
       *
       * TODO: A future release should take an extra parameter and depending
       * on that either always send a text delta or only send it if there
       * is a difference. */
      contents_changed = TRUE;
    }

 ...

 * @note Prior to Subversion 1.9, this function may request delta handlers
 * from @a handler even for empty text deltas.  Starting with 1.9, the
 * delta handler / baton return arguments passed to @a handler will be
 * #NULL unless there is an actual difference in the file contents between
 * the current and the previous call.
 *
 * @since New in 1.5.
 */
svn_error_t *
svn_ra_get_file_revs2(svn_ra_session_t *session,
                      const char *path,
                      ...

]]]

The r1572363 + r1573111 patchset implicitly changes behavior of around six
different callers, and two of them have already backfired with problems.
Moreover, apart from switching everything to the new API, these revisions
change the behavior of *existing* API like svn_fs_contents_changed().  We
have an erratum describing the change [2], but I doubt that API users can
properly adjust their code, if they notice this change — because even we failed
to do so.

Finally, I cannot understand the practical benefits from doing this.  If these
changes are aimed at making the checks more strict, but break existing code,
I don't think we should be doing this.  If this is an optimization, I'd favor
correctness over it.

I propose that, instead of patching the callers in response to problems, we
restore every relevant part to its 1.8.x state, where they're known to be stable
and work.  This would allow us to remove the hacks like the one associated with
fixing svn blame -g, would restore the proper behavior of svnadmin dump, and
would also allow us not to deal with similar problems in other places, if they
exist.

How does this sound?

[1] http://svn.haxx.se/dev/archive-2015-06/0069.shtml
[2] http://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9/fs001.txt


Regards,
Evgeny Kotkov

Mime
View raw message