On Sun, Aug 10, 2014 at 3:56 PM, Evgeny Kotkov <evgeny.kotkov@visualsvn.com> wrote:
Hi Stefan,

> Thanks for having a the (missing) instance ID issue.
> From initial review, I have 1 objection and 2 issues
> that your patch does not address, yet.

Thank you for reviewing this patch.  I did my best, but I cannot really tell,
which of these points are objections, and which of them are issues with the
proposed patch.

No problem, I usually answer questions - eventually ;)
(It all depends on whether an the answers require
thought, what my TODO list or travel schedule is etc.)
 
This patch solves a well-defined problem that was particularly stated in
the comments prior to r1589262:
[[[
  Using the uuid to obtain the lock creates a corner case if a
  caller uses svn_fs_set_uuid on the repository in a process where
  other threads might be using the same repository through another
  FS object. The only real-world consumer of svn_fs_set_uuid is
  "svnadmin load", so this is a low-priority problem, and we don't
  know of a better way of associating such data with the
  repository.
]]]

It's perfectly fine to fix that issue. Doing so introduces
a new feature ("instance ID") and I think the code should
cover related issues as well. It should  take only a few
extra lines to fix the whole class of related problems.

Just to mention, this comment was not restored in r1589518, although the
changeset was claimed to be a "revert".

Thanks for spotting it! r1589518 was not a strict revert.
That's how the old comment fell through the cracks.
Fixed in r1617586.
 
 And (just to mention #2) without a
change like this we *do* have a regression in the incremental hotcopy behavior
compared to 1.8, because now the destination is allowed to be packed while
hotcopy is in progress, and that's where real "ripple effects" become visible.

That sounds like a separate issue. Hotcopy should take
out the pack lock on the target as well. Feel free to fix it.
 
> * Ideally, we would store the instance UUID as a separate
>   file next to uuid and friends. Presumably trying to not
>   increase the fs_open latency low, you merged that info
>   into the format file - causing various ripple effects.
>   I think it should be put into the UUID file (or a separate
>   file altogether) and update places where we rewrite
>   this file (e.g. svnadmin load).

Honestly, I do not get this — what exactly are the "ripple effects" you are
talking about?  Don't you think that adding another file would not have its own
"ripple effects", at least considering the amount of places where you have to
change the code?  And again, I still do not see any real problems with the
approach I chose.

The instance ID has nothing to do with format info. Agree?
Putting it there is just plain wrong.

Putting it into the same file means that you have to filter it
whenever you want to change or check one but not the other.
All of your changes in the fs-fs-pack-test were only needed
for that reason. But these "ripple effects" are only a symptom
of a bad design choice. Having e.g. a completely separate
file would not require any changes to the existing code to
maintain the current functionality.
 
>From my point of view, it is not even a matter of taste.  We require the
'instance-uuid' option as a solution to our *internal* problem with the shared
data clashes (and, possibly, with outdated cache contents after the repository
replacement).  Putting a thing like this into the 'db/format' file has at least
a couple advantages:

- You do not have to open yet-another-file within fs_open() calls.  These are
  frequent — think about how often Apache HTTP Server does it.  So, why don't
  we entirely avoid this overhead, considering the fact that it does not make
  any real difference?

Putting it into the existing uuid file would be just as efficient
and touch fewer code bits.
 
- This actually makes sense.

No. It is does not tell us anything about the FS format.
 
 We have the filesystem UUID stored in a separate
  'db/uuid' file and it can be changed by the end user (svnadmin setuuid).  We
  also have the instance UUID, which is invisible for the user and solves *our*
  problems.  Whenever we use it appropriately in our caches / shared data, the
  end user should not *ever* want to change the instance UUID, and, to my mind,
  we should not introduce another user-visible thing and a corresponding set of
  commands for the administrator to remember.

I'm not 100% sure whether we should expose the
instance ID to the user or not. The use case that
I am unsure about is backup (from hotcopy or not).

Technically, we all working copies become invalid
and 'svnadmin setuuid' makes that explicit. But there
are certainly users who would like to keep their
working copies intact. I honestly have no idea what
the correct approach here is.
 
Finally, I do not really understand the 'svnadmin load' point.  You do not have
to somehow interact with an instance UUID whenever you load something into a
repository, do you?

See above. As soon as you restore from backup,
any internal cache is (potentially) invalid now. An
ID bump (uuid or instance ID) is needed to prevent
disaster unless you restart your server processes.
 
> * The instance ID must become part of the cache key.
>   This is for the hot-restore-from-backup use-case where
>   a repository gets replaced with an older version of itself
>   while the server process is kept alive (caches are still hot).

This is irrelevant to this patch.

I disagree. Although you did not intend to address
the caching issue, it is basically the flip side of the
shared data aliasing problem that you try to solve.
 
 As a matter of fact, this patch also does not
solve the move tracking problem.

How would this even be related to move tracking?
Do you refer to renaming repositories on disk?
 
 Using this new UUID in our caches seems like a
logical next step and I actually had that in mind for implementing.

O.k. And that is what review is for: discussing the
consequences and potentially related issues. I.e.
sharing insight and ideas; it is *not* intended for blaming people
for anything.
 
 However,
we can solve all these problems one-by-one, considering the fact that this patch
is an atomic change solving *one* exact problem.

> * svnadmin should have a means to bump the instance ID.
>   Again, the restore-from-backup use-case.

See above.  Again, I cannot tell whether this is an issue or an objection, but
"should" looks unjustified.  As I understand, you are talking about a triple
edge case:

1. An administrator backups the repository by copying them (making disk images,
   etc.) instead of using the 'hotcopy' command we provide.

2. Something goes wrong, and he/she restores the backup by hot swapping the
   repository *without* restarting the server.

3. Things start to go wrong with the cached data left from the previous
   repository before the swap happened.

There are two ways of solving this — you can restart the server (which is a
better choice) or give the repository a brand new UUID with 'svnadmin setuuid'
prior to the swap.  Hence, I do not see a necessity in another command that
would allow us to change the instance UUIDs.  Administrators would have to
learn about the difference between filesystem and instance UUIDs, which is
pretty much unnecessary, because they *do not* have to know or use these
knowledge in order to solve the problems with UUID clashes; they already
have a way of fixing them.


Regards,
Evgeny Kotkov