subversion-dev mailing list archives

From Johan Corveleyn <jcor...@gmail.com>
Subject Re: upgrade_tests.py 29 spurious failure while testing 1.7.17
Date Tue, 13 May 2014 07:43:43 GMT
On Mon, May 12, 2014 at 9:51 PM, Ben Reser <ben@reser.org> wrote:
> On 5/5/14, 4:24 PM, Johan Corveleyn wrote:
>> As always, I tested with Windows XP (it's end of life, I know ...
>> whatever) on a ramdisk, non-parallel.
>>
>> This time I took a copy of repository and working copy before
>> rerunning the test :-). See attachment. Can anyone shed some light on
>> this?
>>
>> I experimented a bit further with a copy of the repository and working
>> copy of this failed test:
>> - svnadmin verify says everything is ok.
>> - a new svn checkout over svn:// works fine.
>> - executing the failing "svn up" command (the last command of the
>> failure output) on that particular working copy, talking with that
>> particular repository over svn:// ... no problem.
>>
>> So I'm at a loss here. I don't see any corruption, yet the test reported it.
>>
>> Perhaps some kind of cache corruption is a possibility? A theory would
>> be nice ... anything really.
>
> I strongly suspect there is something wrong with your machine (memory going
> bad?).  The repository is nothing more than a dump/load from the greek tree.
> After a dump/load the repository has the UUID set.  No other modifications
> happen to the repo and the only access to the repository via the server is the
> update command that's failing.  That rules out a problem with caching because
> the cache should be entirely cold for this repository when the update command runs.
>
> The error you're getting is:
> svn: E160004: Corrupt node-revision '0.0.r1/4198'
> svn: E160004: Missing id field in node-rev
>
> The closest id in the repository is this: 0.0.r1/4206
>
> The number after the slash is the offset which is stored in the private portion
> of the svn_fs_id_t.  The offset is stored as an apr_off_t (i.e. not a string but
> an integer).
>
> Looking at the offsets in binary yields (leading zeros omitted):
>
> 4206 = 1 0000 0110 1110
> 4198 = 1 0000 0110 0110
>
> Note that they differ by exactly one bit.
>
> A memory issue would probably be very hard to reproduce.  So this seems to fit
> with the issues you've been having.  Combine that with the fact that you've
> been having unreproducible test failures in other places with this setup.  I
> have to conclude you have issues with your memory.  I'd suggest running
> memtest86 on the machine.
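
For anyone who wants to double-check the arithmetic quoted above, here is a
small Python sketch. It is only an illustration, not anything from the
Subversion code base; the helper name offset_of is made up, and it relies
solely on the statement above that the number after the slash is the offset.

```python
def offset_of(node_rev_id: str) -> int:
    """Return the offset, i.e. the number after the slash (hypothetical helper)."""
    return int(node_rev_id.rsplit("/", 1)[1])

reported = offset_of("0.0.r1/4198")  # id from the E160004 error message
expected = offset_of("0.0.r1/4206")  # closest id actually present in the repository

# XOR leaves a 1 only in the positions where the two offsets differ.
diff = reported ^ expected
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(f"{expected} = {expected:013b}")
print(f"{reported} = {reported:013b}")
print(f"differing bit positions (0 = least significant): {flipped_bits}")
```

With these two offsets the XOR is 8, so only bit 3 differs, which is
consistent with the single-bit-flip theory but of course does not prove where
the flip happened.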

First, thanks a lot for taking a look and giving a plausible
explanation. It's a possibility, but I'm not fully convinced yet :-).

Pro:
- It fits theoretically (the one bit off etc).
- It's the only explanation so far. And IIUC cache corruption is ruled out.
- The machine is getting old (almost 8 years now -- I think the memory
is 5 or 6 years old). Its operating system (WinXP) is EOL.

Con:
- I've had zero stability issues with my machine so far. No crashes,
no bluescreens. Not one, as far as I can remember.
- I've been testing / signing svn releases for a couple of years. No
problem, until the last two release cycles or so.
- Ran memtest86 (version 4.0.0 that I still had on some boot CD) last
night. It ran for 8 hours. No errors.

So either my machine really has a memory problem, or it's a unique
machine that can (rarely) reproduce a bug in Subversion. I'm still not
sure. If it's the latter it would be a waste to throw it in the trash
:-). OTOH, if it's such a rare issue that nobody else is seeing this,
maybe it's not worth spending more precious time on (mine, yours and
others') ...

I'll continue pounding it a bit more, but I'll probably give up at
some point (not determined yet).

-- 
Johan
