From: Doug Robinson
Date: Thu, 16 Jun 2016 13:42:56 -0400
Subject: Re: Invalid character in hex checksum?
To: Johan Corveleyn
Cc: "users@subversion.apache.org"

One thing to remember: if revision properties are being changed (i.e. the pre-revprop-change hook allows them), then those changes will need to be captured and replayed into the loaded repository "somehow". An incremental dump only carries revisions from NEXTREV onward, so revprop changes to revisions prior to -rNEXTREV will otherwise be lost.
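
One possible way to do the "somehow", as a rough sketch only: let a post-revprop-change hook on the old repository append "REV PROPNAME" to a log file (for example with 'echo "$2 $4" >> LOGFILE' in the hook), and copy the current values across after each load. The log format, the log file and the function name replay_revprops are assumptions made for this sketch; deleted properties are not handled.

```shell
#!/bin/sh
# Sketch: replay revprop changes made on the old repository into the new one.
# Assumption: a post-revprop-change hook on the old repository appended one
# "REV PROPNAME" line per change to the log file passed as the third argument.
replay_revprops() {
    old=$1
    new=$2
    log=$3
    youngest=$(svnlook youngest "$new") || return 1
    tmp=$(mktemp) || return 1
    sort -u "$log" | while read -r rev prop; do
        # Revisions not loaded yet will pick up the change from the dump.
        [ "$rev" -le "$youngest" ] || continue
        # Copy the current value from the old repository to the new one.
        svnlook propget --revprop -r "$rev" "$old" "$prop" > "$tmp" &&
            svnadmin setrevprop "$new" -r "$rev" "$prop" "$tmp"
    done
    rm -f "$tmp"
}
```

It would be called as 'replay_revprops OLDREPOS NEWREPOS LOGFILE' after each incremental load, and once more after the final one.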

On Thu, Jun 16, 2016 at 6:49 AM, Johan Corveleyn <jcorvel@gmail.com> wrote:
>> You should rather let the computer spend some time dealing with the dump
>> and load process, instead of spending your own time working around problems
>> and not even knowing if the result of your workarounds will be OK.
>
> Will have to plan a maintenance window then.
> Gah. :)

You can do this with a very small maintenance window. The trick is to
dump+load to a new location, while the old repository is still
accessible (for checkouts and commits). After this is done (can take
hours, or even days, whatever) you note the last revision which was
loaded (or check the revision number with 'svnlook youngest
newrepos'), and start another dump+load where you dump with
'--incremental -rNEXTREV:HEAD' (where NEXTREV is the next revision
that needs to be dumped).

You can iterate over this as long as you keep the old repository open
... At the end you make the original repository inaccessible for a
couple of minutes, while you enable the new one (Caveat: if you move
your new repos into the same disk location as the old one, and you use
Apache httpd to serve it, make sure you restart httpd to reset its
caches).

Hint: for a large repo I strongly suggest building the new repository
(the target of your 'svnadmin load') on very fast storage, even
ramdisk if possible (to copy it over to fixed disk afterwards). It's
the 'svnadmin load' part that is very time-consuming right now (this
will probably be much improved in svn 1.10, with the
--no-flush-to-disk option for 'svnadmin load' [1]).

To be precise, your commands might be:

1) svnadmin create NEWREPOS
    (maybe create it on a ramdisk)
2) svnadmin dump -M 1024 OLDREPOS | svnadmin load -M 1024 NEWREPOS
    (initial dump+load; you might want to pass -q to dump and/or load
    to make it quieter)
    (the -M 1024 gives the process 1024 MB extra RAM for caching)
3) svnlook youngest NEWREPOS
    (last rev that was loaded -> NEXTREV is this last rev + 1)
4) svnadmin dump --incremental -rNEXTREV:HEAD -M 1024 OLDREPOS | svnadmin load -M 1024 NEWREPOS
5) Make OLDREPOS read-only or completely unavailable
6) Possibly repeat 4) (if new commits happened after 4)
7) Put NEWREPOS online
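
Steps 1-4 and 6 could be wrapped in two small shell functions, roughly like this. The function names are made up for illustration, and -q is added to keep the output quiet. Note that a plain sh pipeline only reports the exit status of 'svnadmin load', so a failing dump would need to be checked separately.

```shell
#!/bin/sh
# Steps 1 and 2: create the new repository and do the initial dump+load.
initial_load() {
    old=$1
    new=$2
    svnadmin create "$new" &&
        svnadmin dump -q -M 1024 "$old" | svnadmin load -q -M 1024 "$new"
}

# Steps 3 and 4 (and 6): load whatever was committed since the last load.
catch_up() {
    old=$1
    new=$2
    last=$(svnlook youngest "$new") || return 1
    head=$(svnlook youngest "$old") || return 1
    # Nothing to do if the new repository is already up to date.
    [ "$last" -lt "$head" ] || return 0
    next=$((last + 1))
    svnadmin dump -q --incremental -r "$next:$head" -M 1024 "$old" |
        svnadmin load -q -M 1024 "$new"
}
```

Usage would be 'initial_load OLDREPOS NEWREPOS' once, followed by 'catch_up OLDREPOS NEWREPOS' as often as needed, the last time after OLDREPOS has been made read-only.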

At my company we recently did a dump+load from an old svn 1.5
repository ("Repository format: 5; Filesystem Type: fsfs; Filesystem
Format: 3; Sharded but unpacked") to the brand new FSFS format 7 (new
in svn 1.9). It went fine with this procedure. Our largest repos was
15 GB with 328000 revisions. We created the new repository on a
ramdisk, and the loading finished in 6 hours. After that we ran
'svnadmin pack' (still on ramdisk), and then copied it to fixed disk.
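
The finishing step might look roughly like this. The function name and both paths are placeholders, and the 'svnadmin verify' call is an extra sanity check that the description above does not mention.

```shell
#!/bin/sh
# Sketch: pack the freshly loaded repository on the ramdisk, then copy it
# to its permanent location. The destination must not exist yet.
finalize_repo() {
    ram=$1
    dest=$2
    svnadmin pack "$ram" &&
        svnadmin verify -q "$ram" &&   # extra check, not part of the original steps
        cp -Rp "$ram" "$dest"
}
```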

Some things to watch out for:

- Dump+load does not preserve locks (in REPOS/db/locks), hooks (in
REPOS/hooks) and configuration files (in REPOS/conf). For those, a 'cp
-rp SOURCE TARGET' works well (but this is a good time to review your
hooks and conf files to make sure they are still fine). Make sure the
source repository is offline while you copy the locks, as those might
otherwise be changed in mid-flight.
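
A small helper for that copy could look like this. It is only a sketch: the function name is made up, and it overwrites same-named files that 'svnadmin create' put into the new repository.

```shell
#!/bin/sh
# Sketch: copy hooks, conf and locks from the old repository to the new one.
# Run it only while the old repository is offline, so the locks cannot change.
copy_repo_extras() {
    old=$1
    new=$2
    for sub in hooks conf db/locks; do
        # Skip directories that are missing in the old repository.
        [ -d "$old/$sub" ] || continue
        mkdir -p "$new/$sub" || return 1
        # -R recurses, -p keeps permissions and timestamps, so hook scripts
        # stay executable.
        cp -Rp "$old/$sub/." "$new/$sub/" || return 1
    done
}
```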

- You might run into:
> svnadmin: E125005: Invalid property value found in dumpstream; consider repairing the
> source or using --bypass-prop-validation while loading.
>
> svnadmin: E125005: Cannot accept non-LF line endings in 'svn:log' property

You can try to repair this in the source repository (with svnadmin
setrevprop) -- I'll post separately about that -- or you might simply ignore this minor corruption by using --bypass-prop-validation for
'svnadmin load' (you can always repair this later in the new
repository).
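
A repair of a single log message could be sketched as follows. This is not the procedure promised above, just an illustration; it assumes the bad line endings are CRLF (a lone CR would simply be deleted, joining two lines), and the function name is made up. 'svnadmin setrevprop' does not run the revprop hooks unless asked to.

```shell
#!/bin/sh
# Sketch: rewrite the svn:log property of one revision with LF line endings.
# Assumption: the offending line endings are CRLF.
fix_log_eol() {
    repos=$1
    rev=$2
    tmp=$(mktemp) || return 1
    svnlook propget --revprop -r "$rev" "$repos" svn:log | tr -d '\r' > "$tmp" &&
        svnadmin setrevprop "$repos" -r "$rev" svn:log "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

It would be called as 'fix_log_eol REPOS REV' for each revision that 'svnadmin load' complains about.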

- You might run into:
> svnadmin: E125005: Invalid property value found in dumpstream; consider repairing the
> source or using --bypass-prop-validation while loading.
>
> svnadmin: E125005: Cannot accept non-LF line endings in 'svn:ignore' property

This is more difficult to repair, because 'svn:ignore' is not a
revision property (like svn:log, which can be manipulated with
svnadmin setrevprop), but a *versioned property* (so it's part of
history). Again, you can ignore this with --bypass-prop-validation.
But since this is a corruption "in history", this can only be repaired
with a dump+load, so this might be a good time to try and fix this (or
you'll run into this again in the future). To repair it (which is what
we did), you can use svndumptool [2]. But it only works on dump
*files*, not as part of a pipe. So what we did was: dumped that single
(corrupt) revision to a file, repaired it ('svndumptool.py eolfix-prop
svn:ignore svn.dump svn.dump.repaired'), loaded that single dumpfile,
and then continued with a new "piped" command (like step (4) above).
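
The single-revision repair described above could be scripted roughly like this. The function name and the SVNDUMPTOOL variable are assumptions of this sketch; the new repository must already be loaded up to the revision just before the corrupt one.

```shell
#!/bin/sh
# Sketch: dump one corrupt revision to a file, repair svn:ignore line endings
# with svndumptool, and load the repaired file into the new repository.
repair_revision() {
    old=$1
    new=$2
    rev=$3
    # SVNDUMPTOOL is a hypothetical override for the location of the script.
    tool=${SVNDUMPTOOL:-svndumptool.py}
    dir=$(mktemp -d) || return 1
    svnadmin dump -q --incremental -r "$rev" "$old" > "$dir/svn.dump" &&
        "$tool" eolfix-prop svn:ignore "$dir/svn.dump" "$dir/svn.dump.repaired" &&
        svnadmin load -q "$new" < "$dir/svn.dump.repaired"
    status=$?
    rm -rf "$dir"
    return $status
}
```

After 'repair_revision OLDREPOS NEWREPOS REV' the piped dump+load can continue from REV + 1.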


[1] http://svn.apache.org/viewvc?view=revision&revision=1736357
[2] https://github.com/jwiegley/svndumptool

--
Johan



--
DOUGLAS B. ROBINSON, SENIOR PRODUCT MANAGER




