subversion-users mailing list archives

From "Luke Perkins" <lukeperk...@epicdgs.us>
Subject [jira] [Commented] (SVN-4668) svnserve dump format order has changed
Date Mon, 09 Jan 2017 16:09:41 GMT
From: Luke Perkins [mailto:lukeperkins@epicdgs.us] 
Sent: Monday, January 9, 2017 06:33
Cc: 'jira@apache.org' <jira@apache.org>
Subject: RE: [jira] [Commented] (SVN-4668) svnserve dump format order has changed

I think the problem at hand is the following section of code in libsvn_repos/dump.c,
starting at line 405. This section was significantly rewritten in January 2015 by the
user "julianfoad". I am still working on determining the root cause; however, it appears
that the directive "Content-length must be last" is the key phrase behind the reordering
of the SVN dump record format.

The old format order was:

1) Prop-content-length
2) Text-content-length
3) Text-content-sha1
4) Text-content-md5

Now the format order is:

1) Text-content-length
2) Text-content-sha1
3) Text-content-md5
4) Prop-content-length

/* Write headers, in arbitrary order.
 * ### TODO: use a stable order
 * ### Modifies HEADERS.
 */
static svn_error_t *
write_revision_headers(svn_stream_t *stream,
                       apr_hash_t *headers,
                       apr_pool_t *scratch_pool)
{
  const char **h;
  apr_hash_index_t *hi;

  static const char *revision_headers_order[] =
  {
    SVN_REPOS_DUMPFILE_REVISION_NUMBER,  /* must be first */
    NULL
  };

  /* Write some headers in a given order */
  for (h = revision_headers_order; *h; h++)
    {
      SVN_ERR(write_header(stream, headers, *h, scratch_pool));
      svn_hash_sets(headers, *h, NULL);
    }

  /* Write any and all remaining headers except Content-length.
   * ### TODO: use a stable order
   */
  for (hi = apr_hash_first(scratch_pool, headers); hi; hi = apr_hash_next(hi))
    {
      const char *key = apr_hash_this_key(hi);

      if (strcmp(key, SVN_REPOS_DUMPFILE_CONTENT_LENGTH) != 0)
        SVN_ERR(write_header(stream, headers, key, scratch_pool));
    }

  /* Content-length must be last */
  SVN_ERR(write_header(stream, headers, SVN_REPOS_DUMPFILE_CONTENT_LENGTH,
                       scratch_pool));

  return SVN_NO_ERROR;
}
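
As a rough illustration of the "use a stable order" TODO in the code above, the sketch below sorts headers so that Revision-number comes first, Content-length comes last, and everything in between is ordered by key. It is plain C only: header_t, header_rank, compare_headers and sort_headers_stable are hypothetical names made up for this example, not part of the Subversion or APR API, and this is not the actual fix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one dump header; not a Subversion type. */
typedef struct {
  const char *key;
  const char *value;
} header_t;

/* Rank a header key: 0 = must be first, 2 = must be last, 1 = the rest. */
static int
header_rank(const char *key)
{
  if (strcmp(key, "Revision-number") == 0)
    return 0;
  if (strcmp(key, "Content-length") == 0)
    return 2;
  return 1;
}

/* qsort comparator: order by rank, then byte-wise by key. */
static int
compare_headers(const void *a, const void *b)
{
  const header_t *ha = (const header_t *)a;
  const header_t *hb = (const header_t *)b;
  int ra = header_rank(ha->key);
  int rb = header_rank(hb->key);

  if (ra != rb)
    return ra - rb;
  return strcmp(ha->key, hb->key);
}

/* Sort HEADERS (COUNT entries) in place into a deterministic order. */
static void
sort_headers_stable(header_t *headers, size_t count)
{
  assert(headers != NULL || count == 0);
  qsort(headers, count, sizeof(headers[0]), compare_headers);
}
```

Calling sort_headers_stable() on an array of headers before writing them out would give the same order on every run, independent of hash iteration order. Note that a sorted order like this would still differ from the order older releases happened to produce.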

Thank you,

Luke Perkins

-----Original Message-----
From: Bert Huijben (JIRA) [mailto:jira@apache.org]
Sent: Monday, January 9, 2017 03:52
To: lukeperkins@epicdgs.us
Subject: [jira] [Commented] (SVN-4668) svnserve dump format order has changed


    [ https://issues.apache.org/jira/browse/SVN-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811591#comment-15811591 ]

Bert Huijben commented on SVN-4668:
-----------------------------------

One of the problems here is that we never explicitly coded dump to produce a strict order. The
code used to iterate over the members of directories in the order they were stored in an APR
hash table. Then at one point APR changed its implementation from a mostly stable ordering to
a randomized one, to avoid attacks in certain use cases of hash tables. The dumpfiles were
still valid at that point, but some operations might appear in a different order. Technically,
all of this still produces exactly the same commits.

When we found this problem in Subversion in operations like 'svn status -U' we changed some
parts of our code to start producing a strict stable order, but this new order is different
than the one that used to be produced by the old apr hashtable implementation. I'm not sure
if the replay api was (already) changed for this.

In Subversion 1.9, as part of optimizing fsfs, the filesystem layer can now produce an 'optimal
ordering' of the members of a directory for cheap access at the filesystem layer. This might
have changed the ordering again, and it might change the ordering again in the future.

Other 1.9 work includes making the svnadmin dump format more stable between the different
producers (svnadmin dump, svnrdump dump).

I'll try to add a few interesting issue numbers to this issue. But I think we should discuss
this on the users or dev list first before proposing to 'fix' this, as I don't see a simple
fix that works for all use cases.
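
Since the reordered dumpfiles are described above as producing the same commits, one way to compare an archived dump against a fresh one is to ignore header order. The sketch below checks whether two header blocks contain the same lines regardless of order. It is plain C with hypothetical names (compare_lines, header_blocks_equivalent), it assumes each block has already been split into individual header lines, and it does not parse real dumpfiles or compare property and text bodies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int
compare_lines(const void *a, const void *b)
{
  const char *const *la = (const char *const *)a;
  const char *const *lb = (const char *const *)b;
  return strcmp(*la, *lb);
}

/* Return 1 if header blocks A and B hold the same lines in any order,
 * else 0.  Sorts both arrays in place. */
static int
header_blocks_equivalent(const char **a, size_t a_count,
                         const char **b, size_t b_count)
{
  size_t i;

  assert(a != NULL && b != NULL);
  if (a_count != b_count)
    return 0;

  qsort(a, a_count, sizeof(a[0]), compare_lines);
  qsort(b, b_count, sizeof(b[0]), compare_lines);

  for (i = 0; i < a_count; i++)
    if (strcmp(a[i], b[i]) != 0)
      return 0;

  return 1;
}
```

A plain "diff" reports the old and new header orders as different; a comparison along these lines treats them as equal as long as every header line matches.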

> svnserve dump format order has changed
> --------------------------------------
>
>                 Key: SVN-4668
>                 URL: https://issues.apache.org/jira/browse/SVN-4668
>             Project: Subversion
>          Issue Type: Bug
>          Components: svnserve
>    Affects Versions: 1.9.3
>         Environment:  Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-53-generic x86_64)
>            Reporter: Luke Perkins
>         Attachments: SvnserveDumpIssue_20170107.jpg
>
>
> The format of the svnserve dump file has changed somewhere between version 1.8 and 1.9.3
> (version 1.9.3 (r1718519)). I routinely perform svnserve dump operations of my repositories
> and compare them against archived copies of dump files to be used for emergency recovery
> operations.
> It appears the content order difference is benign, other than that "diff" operations fail.
> I have a file illustrating the difference.
> The version information for svnserve dump is:
> svnserve, version 1.9.3 (r1718519)
>    compiled Mar 14 2016, 07:39:01 on x86_64-pc-linux-gnu
> Copyright (C) 2015 The Apache Software Foundation.
> This software consists of contributions made by many people;
> see the NOTICE file for more information.
> Subversion is open source software, see http://subversion.apache.org/ 
> The following repository back-end (FS) modules are available:
> * fs_fs : Module for working with a plain file (FSFS) repository.
> * fs_x : Module for working with an experimental (FSX) repository.
> * fs_base : Module for working with a Berkeley DB repository.
> Cyrus SASL authentication is available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

