jackrabbit-users mailing list archives

From Robin Wyles <ro...@jacaranda.co.uk>
Subject Re: Problems migrating from 1.6.0 to 2.1.0
Date Wed, 01 Sep 2010 14:15:00 GMT
Hi Stefan,

On 1 Sep 2010, at 10:44, Stefan Guggisberg wrote:

> On Wed, Sep 1, 2010 at 11:38 AM, Robin Wyles <robin@jacaranda.co.uk> wrote:
>> An update on this...
>> 
>> Stefan was indeed correct and it was a charset/encoding issue that was causing Jackrabbit
>> to ignore the existing repository content.
> 
> thanks for the information. can you please provide more details about
> the exact nature of the problem?
> 

Sure, it seems that mysqldump has a habit of corrupting data when the charset is anything other
than latin1. We forced the use of latin1 with the following commands to export/import our repository data:

mysqldump -u username -p --default-character-set=latin1 -N database > backup.sql
mysql -u username -p --default-character-set=latin1 database < backup.sql

There's more info here:

http://docforge.com/wiki/Mysqldump
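
As a rough illustration of why forcing latin1 can help with binary key columns: latin1 maps every byte value to exactly one character, so arbitrary bytes survive a decode/encode round trip, whereas bytes that are not valid UTF-8 do not. The sketch below is only a toy model of that property and does not describe what mysqldump does internally; the helper name `round_trips` and the sample 16-byte id are made up for illustration.

```python
def round_trips(raw: bytes, charset: str) -> bool:
    """Return True if decoding raw as charset and re-encoding yields the same bytes."""
    text = raw.decode(charset, errors="replace")
    return text.encode(charset, errors="replace") == raw


# Illustrative 16-byte binary id (an assumption, chosen because it is not valid UTF-8).
sample_id = bytes.fromhex("cafebabecafebabecafebabecafebabe")

print(round_trips(sample_id, "latin-1"))  # every byte maps to one character
print(round_trips(sample_id, "utf-8"))    # invalid sequences are replaced, so bytes change
```

If a binary id changes even slightly during export/import, a lookup by that id would no longer match the stored row, which would be consistent with the repository appearing empty.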

Even though this appears to work, we're still unable to see any nt:file nodes whose binary
data is stored in the datastore. I'm not sure whether this is a related or a separate issue...
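
One way to narrow this down might be to check whether the files behind the missing records are actually present on disk after the copy. The sketch below rests on an assumption that should be verified against the Jackrabbit version in use: that FileDataStore names each record after the SHA-1 hex digest of its content and nests it under three two-character directories taken from the start of that digest. The function name `record_path` and the `datastore` directory are hypothetical.

```python
import hashlib
from pathlib import Path


def record_path(datastore_root: Path, content: bytes) -> Path:
    """Compute where a record would live under the ASSUMED FileDataStore layout."""
    digest = hashlib.sha1(content).hexdigest()
    # Assumed layout: <root>/<chars 0-1>/<chars 2-3>/<chars 4-5>/<full digest>
    return datastore_root / digest[0:2] / digest[2:4] / digest[4:6] / digest


# Hypothetical usage: check whether a known binary made it across with the copy.
expected = record_path(Path("datastore"), b"example file content")
print(expected, "exists" if expected.exists() else "is missing")
```

If the expected files are all present, the problem is more likely in how the database rows refer to them than in the copied datastore directory.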

Robin

> cheers
> stefan
> 
>> 
>> However, now that I have managed to get our existing repository running under 2.1.0
>> I have a new problem: all the nt:file nodes whose content is stored in the
>> datastore (FileDataStore) are missing. The small nt:file nodes that are stored in the database
>> are visible, just not those in the FileDataStore.
>> 
>> When starting up our newly migrated repository for the first time I get a few "Record
>> not found" datastore exceptions and some associated Tika exceptions for those missing datastore
>> records - would those errors prevent the entire datastore from being used? The number of errors
>> is far lower than the 3000 or so items in the datastore, which suggests that either most of
>> the datastore contents are being ignored or, at start up at least, they are recognised as valid.
>> 
>> As before, once our repository has started I am able to add new nodes to the datastore,
>> and these behave as expected.
>> 
>> Any help gratefully received - I'm really keen to get our repos onto 2.1.0 as some
>> of its new query functionality is much needed!
>> 
>> Robin
>> 
>> 
>> 
>> 
>> On 27 Aug 2010, at 16:03, Robin Wyles wrote:
>> 
>>> Hi Stefan
>>> 
>>> On 27 Aug 2010, at 13:11, Stefan Guggisberg wrote:
>>> 
>>>> On Fri, Aug 27, 2010 at 2:02 PM, Stefan Guggisberg
>>>> <stefan.guggisberg@day.com> wrote:
>>>>> On Fri, Aug 27, 2010 at 1:18 PM, Robin Wyles <robin@jacaranda.co.uk> wrote:
>>>>>> Hi Stefan,
>>>>>> 
>>>>>> Thanks for your quick reply...
>>>>>> 
>>>>>> On 27 Aug 2010, at 11:36, Stefan Guggisberg wrote:
>>>>>> 
>>>>>>> hi robin,
>>>>>>> 
>>>>>>> On Fri, Aug 27, 2010 at 11:25 AM, Robin Wyles <robin@jacaranda.co.uk> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I'm having problems migrating an existing repository from
>>>>>>>> Jackrabbit 1.6.0 to 2.1.0.
>>>>>>>> 
>>>>>>>> Here are the steps I followed to test the migration:
>>>>>>>> 
>>>>>>>> 1. Update app to use Jackrabbit 2.1.0, run unit tests etc.
>>>>>>>> Manually test against empty 2.1.0 repository. All works fine here. Our repository
>>>>>>>> configuration has not changed at all between versions.
>>>>>>>> 
>>>>>>>> 2. Use mysqldump to export production repository.
>>>>>>>> 
>>>>>>>> 3. Copy production repository directory (workspace folder,
>>>>>>>> datastore, index folders etc.) to test machine.
>>>>>>>> 
>>>>>>>> 4. Import SQL file from 2 above to new DB on test machine.
>>>>>>>> 
>>>>>>>> 5. Start application on test machine.
>>>>>>>> 
>>>>>>>> The result of the above is that the application starts up
>>>>>>>> without error but the repository appears empty. I am able to add new nodes to the
>>>>>>>> repository, and these behave correctly within the application, yet none of the existing
>>>>>>>> nodes are visible. I've tried xpath queries against known paths, e.g. "//library/*",
>>>>>>>> and these return 0 nodes.
>>>>>>>> 
>>>>>>>> A few things I've tried/noticed:
>>>>>>>> 
>>>>>>>> 1. Repeating steps 3 and 4 above, then removing the old index
>>>>>>>> directories before starting the application. Jackrabbit creates new lucene indexes, but
>>>>>>>> they are very small, just like they would be when initialising an empty repository. Also,
>>>>>>>> the index files are called indexes_2 rather than indexes as they were under 1.6.0.
>>>>>>>> 
>>>>>>>> 2. When starting the app after the migration I notice that
>>>>>>>> 4 extra records have been added to the BUNDLE table, 3 extra records to the
>>>>>>>> VERSION_BUNDLE table and 2 extra records to the VERSION_NAMES table. Again, this seems
>>>>>>>> to be consistent with what is automatically added to the database when a new repository
>>>>>>>> is initialised.
>>>>>>>> 
>>>>>>>> So, basically it appears that Jackrabbit is completely ignoring
>>>>>>>> the existing repository data, and instead initialising a new repos using the existing database…
>>>>>>>> 
>>>>>>>> If anyone has any ideas as to how I can get 2.1.0 to recognise
>>>>>>>> our existing repository they'd be gratefully received - I feel there must be something
>>>>>>>> simple I've overlooked!
>>>>>>> 
>>>>>>> hmm, seems like the key values (i.e. the id format) have changed.
>>>>>>> however, i am not aware of such a change.
>>>>>>> maybe someone else knows more?
>>>>>> 
>>>>>> The release notes for Jackrabbit 2.0.0 claim that it is backward
>>>>>> compatible with 1.x repositories. I've seen a couple of messages on the users list relating
>>>>>> to migration issues but these seem to involve custom nodetypes, whereas our repository has
>>>>>> no custom nodetypes.
>>>>>> 
>>>>>> How may I see what key values/ID format is used by the different
>>>>>> versions? This sounds like quite a major change to me, and I'm sure it is something that
>>>>>> would've been documented!
>>>>> 
>>>>> absolutely. however, if you're saying that 4 extra records have been
>>>>> inserted into the BUNDLE table
>>>>> and the BUNDLE table already had n>=4 records, i can only explain it
>>>>> with a changed binary representation
>>>>> of the record id's.
>>>>> 
>>>>> the 4 BUNDLE records are:
>>>>> 
>>>>> / (root node)
>>>>> /jcr:system
>>>>> /jcr:system/jcr:nodeTypes
>>>>> /jcr:system/jcr:versionStore
>>>>> 
>>>>> the ids of those nodes are hard-coded in jackrabbit.
>>>>> on startup, those nodes will be created if they don't exist.
>>>>> 
>>>>> i am not a mysql expert. have you compared the configurations
>>>>> of both mysql instances? maybe it's some strange charset/encoding
>>>>> issue...
>>> 
>>> Both mysql instances use the same charset/encoding, and all tables on both instances
>>> are set to utf-8 for encoding and collation.
>>> 
>>> The only difference between the two mysql instances is their version - slightly
>>> older on our production machine.
>>> 
>>> However, what you say makes sense - it really does look like Jackrabbit can't
>>> find those nodes on start up which implies there's a charset/encoding issue.
>>> 
>>> I'm going to see if I can duplicate the database on our production mysql instance
>>> and test against that...
>>> 
>>>> 
>>>> or maybe it's a problem with the mysql indexes on those tables...
>>>> 
>>> 
>>> I tried deleting the mysql indexes and recreating them, but it didn't seem to make
>>> any difference.
>>> 
>>> Thanks,
>>> 
>>> Robin
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 


