incubator-ooo-dev mailing list archives

From Rob Weir <robw...@apache.org>
Subject [Repo][Proposal] OOO340 SVN Dump file import
Date Mon, 15 Aug 2011 16:46:25 GMT
We've been discussing for two months now how to get from Hg over to SVN.
There have been several suggestions for how the CWSs and complete
revision history could be migrated over, but little progress has been
made.  Either the proposals didn't work, or no volunteers stepped
forward to implement them.

The alternative proposal was to just check in the tip of the trunk,
without history, and then migrate Hg to Apache-Extras.org, where Hg is
supported.  I've made some progress on this proposal.

Here's what I did.  I'd like some review, to make sure I didn't screw
anything up.  I am neither an Hg nor an SVN expert.  But I do have a big
hard drive.

I used the Subversion command-line client, version
1.6.17-SlikSvn-tag-1.6.17@1130896-WIN32.

I first brought down OOo, both the trunk and the language stuff, into
separate directories:

hg clone http://hg.services.openoffice.org/OOO340
hg clone http://hg.services.openoffice.org/master_l10n/OOO340/

I then moved these into a common directory structure, as Ingrid had
earlier suggested:

ooo/trunk/core --- all the OOO340 stuff
ooo/trunk/l10n -- all the language stuff

I removed the .hg directories before proceeding, so I had a clean local copy.
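
A minimal sketch of that cleanup step, in Python.  The function name
remove_hg_dirs is hypothetical, and it assumes that deleting every
directory named ".hg" under the merged tree is all that is wanted; other
Hg files such as .hgignore or .hgtags are left alone.

```python
import os
import shutil


def remove_hg_dirs(root):
    """Delete every directory named '.hg' under root; return the removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            target = os.path.join(dirpath, ".hg")
            shutil.rmtree(target)
            # Tell os.walk not to descend into the directory we just deleted.
            dirnames.remove(".hg")
            removed.append(target)
    return removed
```

Calling remove_hg_dirs on the top of the merged tree would return the
list of metadata directories it deleted, which is worth eyeballing before
the import.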

I then created a local SVN repository, enabled auto-props to get the
proper EOL treatment and imported the project:

svn import c:\merged file:///c:/svn-repo/ -m "initial import"

During local svn import I received error messages:

"svn: Inconsistent line ending style"

This typically indicates that a text file has a mix of EOL styles
(DOS/UNIX).  But I found some cases where that was not true, and where
the problem appeared instead to be related to an unsupported encoding.
For example, SVN does not seem to support UTF-16 encodings.

I received this "Inconsistent line ending style" on the following files:

ooo/trunk/core/dictionaries/de_DE/README_hyph_de_DE.txt
ooo/trunk/core/dictionaries/de_CH/README_hyph_de_CH.txt
ooo/trunk/core/dictionaries/de_AT/README_hyph_de_AT.txt
ooo/trunk/core/gettext/gettext-0.18.1.1.patch
ooo/trunk/core/apache-commons/patches/codec.patch
ooo/trunk/core/libcroco/libcrco-0.6.2.patch
ooo/trunk/core/testautomation/writer/optional/input/import/mactext.txt
ooo/trunk/core/graphite/graphite-2.3.1.patch
ooo/trunk/core/hwpfilter/source/hwpeq.cpp (some weird non-ASCII text in file, should review)
ooo/trunk/core/solenv/bin/cwstouched.pl (should review)
ooo/trunk/core/readlicense_oo/html/THIRDPARTYLICENSEREAMDE.html
ooo/trunk/core/writerfilter/source/doctok/escher.html
ooo/trunk/core/writerfilter/source/odiapi/qname/resource/office2003/WordprocessingML Schemas/xsdlib.xsd (convert from UTF-16 to UTF-8)
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/body.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/style_mapping_css.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/style_collector_css.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_cells.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_columns.xsl
ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_rows.xsl

In each case, the error aborted the import, which then had to be
restarted from the top.  So it was a slow process, finding all of
these problem files.  Possible solutions could include adding them as
binary (not text) files, or editing them (dos2unix, e.g.) to make
their EOL style consistent.  I did the latter.
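
For anyone without dos2unix at hand, the same kind of fix can be
sketched in Python.  This is not the tool that was actually used here:
the function names are hypothetical, and unlike plain dos2unix this
version also turns lone CR characters (old Mac style) into LF, which is
an extra assumption.

```python
def normalize_to_lf(data):
    """Rewrite CRLF and lone CR line endings as LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def utf16_to_utf8(data):
    """Re-encode BOM-marked UTF-16 bytes as UTF-8 (the BOM is dropped)."""
    return data.decode("utf-16").encode("utf-8")


def fix_file(path, is_utf16=False):
    """Normalize one file in place; return True if its bytes changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    data = utf16_to_utf8(original) if is_utf16 else original
    data = normalize_to_lf(data)
    if data != original:
        with open(path, "wb") as handle:
            handle.write(data)
    return data != original
```

A converted XML file may still carry an encoding="UTF-16" declaration in
its first line, which would need a manual edit afterwards.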

Note:  any other approach to migrating Hg to SVN will run into the
above problem files, so I'd recommend that anyone who wants to try an
alternative migration approach start by fixing the above files.

Once the project was imported, I did an svn export to get a clean copy
of the project, and compared it to the original directory.  The file
counts matched, which is a good sign:  69202 files.
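
A matching count does not by itself show that the same paths are
present on both sides.  The sketch below (hypothetical function names,
not the check that was actually run) compares the two trees by relative
path and reports anything present on only one side.  It deliberately
does not compare file contents, since the EOL handling may legitimately
change the bytes of text files.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    files = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            files.add(os.path.relpath(full, root))
    return files


def compare_trees(original, exported):
    """Summarize file counts and paths present in only one of two trees."""
    left = relative_files(original)
    right = relative_files(exported)
    return {
        "original_count": len(left),
        "exported_count": len(right),
        "only_in_original": sorted(left - right),
        "only_in_exported": sorted(right - left),
    }
```

Empty directories are ignored by this comparison, because only files are
counted.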

I then ran svnadmin dump c:\svn-repo >ooo-dump to create a dump file of
this repository.

The dump file is 1.8 GB, with an MD5 hash of:  fd611942d297128d021cd03795b54708

It compressed to a 367 MB gzip which I've put on my website here:

http://www.robweir.com/ooo-dump.gz
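
Reviewers who download the archive may want to check it against the MD5
hash quoted above.  A sketch, assuming the hash was taken over the
uncompressed dump rather than over the .gz file; the function name is
hypothetical.  It streams the data so the 1.8 GB dump never has to fit
in memory or be unpacked to disk.

```python
import gzip
import hashlib


def md5_of_gzip_contents(path, chunk_size=1 << 20):
    """Return the hex MD5 of the decompressed contents of a gzip file."""
    digest = hashlib.md5()
    with gzip.open(path, "rb") as stream:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the assumption holds and the download is intact, the value returned
for ooo-dump.gz should equal the hash given above.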

So unless anyone has a better idea, and more importantly, is willing
to implement a better idea, I'd like to go forward with importing this
dump file.

Let's take a few days to review the steps above, and to review the
dump file, to make sure there are not any major errors introduced.  If
someone can kick off a build with this source, it would be a great way
to confirm.

I have all of my partial steps saved, so making small tweaks to this
is relatively easy.  For example, if there are some file extensions
used by OOo that should be treated as text, but are not listed in the
standard SVN config, or in the recommended Apache project extended
list, this is a good time to get those corrected.
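
One way to look for such extensions is to list everything in the tree
that no auto-props pattern matches.  This is a sketch under stated
assumptions: the function name is hypothetical, the patterns have to be
copied by hand from whatever config was used, and matching is done
case-insensitively, which may not be how SVN itself matches.

```python
import fnmatch
import os
from collections import Counter


def uncovered_extensions(root, patterns):
    """Count files under root, by extension, that match none of the patterns."""
    lowered = [pattern.lower() for pattern in patterns]
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatchcase(name.lower(), p) for p in lowered):
                continue
            # Files without an extension are grouped under "(none)".
            extension = os.path.splitext(name)[1].lower() or "(none)"
            counts[extension] += 1
    return counts
```

Sorting the result with most_common() puts the most frequent uncovered
extensions first, which is where a missing text pattern would matter
most.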

Regards,

-Rob
