jackrabbit-users mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: JCR Import/Export gotchas
Date Fri, 09 Feb 2007 14:16:46 GMT

On 2/8/07, Shaun Barriball <sbarriba@yahoo.co.uk> wrote:
> Having read the spec relating to Session.import/export and
> Workspace.getImportContentHandler() I'd welcome comments on gotchas to
> avoid:
>  * using 'document' versus 'system' view. I presume for backup & restore use
> system view?

Correct. The document view doesn't support full roundtripping, i.e.
importing an exported document view XML stream does not necessarily
produce the same content structure. For example all multivalued
properties are currently dropped.
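
As a rough illustration of why the system view can roundtrip such
properties: it writes each property as an sv:property element with one
sv:value child per value, so a multivalued property keeps all of its
values. The sketch below only builds that XML shape with the JDK's DOM
API; the class SystemViewShape and its methods are hypothetical helpers
for illustration, not part of JCR or Jackrabbit. In a real backup you
would let Session.exportSystemView() produce the stream.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: models only the XML shape of a system view property.
final class SystemViewShape {
    // Namespace URI bound to the sv: prefix in JCR 1.0 system view XML.
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    // Creates an empty namespace-aware DOM document.
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds an sv:property element with one sv:value child per value.
    static Element property(Document doc, String name, String type, List<String> values) {
        Element property = doc.createElementNS(SV_NS, "sv:property");
        property.setAttributeNS(SV_NS, "sv:name", name);
        property.setAttributeNS(SV_NS, "sv:type", type);
        for (String v : values) {
            Element value = doc.createElementNS(SV_NS, "sv:value");
            value.setTextContent(v);
            property.appendChild(value);
        }
        return property;
    }

    // Reads the values back in document order.
    static List<String> values(Element property) {
        List<String> result = new ArrayList<>();
        NodeList children = property.getElementsByTagNameNS(SV_NS, "value");
        for (int i = 0; i < children.getLength(); i++) {
            result.add(children.item(i).getTextContent());
        }
        return result;
    }
}
```

The document view, by contrast, maps properties to XML attributes of
the node's element, which is where multivalued properties get lost.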

>  * "session imports" versus direct "workspace imports"

A session import first stores the imported content in the in-memory
transient space, and the content is persisted only by a separate
save() call. Thus you generally want to use a workspace import for
large imports.

However, note that the current Jackrabbit internals keep even a
workspace import within an internal update batch, so memory
requirements grow linearly with the amount of imported content in both
workspace and session imports. Thus, you may want to consider using a
custom importer tool that calls save() every now and then to persist
the content in smaller pieces. This is, however, a bit troublesome if
you use lots of references, as the reference targets may not yet be
available when an intermediate save() is performed.
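
One possible shape for such a custom importer is sketched below. It
saves after every batchSize nodes and postpones any reference whose
target has not been created yet until all nodes exist. Everything here
is an assumption for illustration: ContentSink is a hypothetical
stand-in for the few session operations the importer needs, and
ImportItem, BatchingImporter and RecordingSink are made-up names, not
Jackrabbit classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the session operations the importer needs.
interface ContentSink {
    void addNode(String id);
    void addReference(String fromId, String toId);
    void save();
}

// One item to import: a node id plus an optional reference target (null if none).
final class ImportItem {
    final String id;
    final String referenceTarget;

    ImportItem(String id, String referenceTarget) {
        this.id = id;
        this.referenceTarget = referenceTarget;
    }
}

final class BatchingImporter {
    // Imports all items, calling save() after every batchSize nodes.
    static void importAll(List<ImportItem> items, ContentSink sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        Set<String> created = new HashSet<>();
        List<ImportItem> deferred = new ArrayList<>();
        int pending = 0;
        for (ImportItem item : items) {
            sink.addNode(item.id);
            created.add(item.id);
            if (item.referenceTarget != null) {
                if (created.contains(item.referenceTarget)) {
                    sink.addReference(item.id, item.referenceTarget);
                } else {
                    deferred.add(item); // target not imported yet, set it later
                }
            }
            if (++pending >= batchSize) {
                sink.save(); // persist this batch before continuing
                pending = 0;
            }
        }
        // Second pass: every node exists now, so deferred references can be set.
        for (ImportItem item : deferred) {
            if (!created.contains(item.referenceTarget)) {
                throw new IllegalStateException("Dangling reference from " + item.id);
            }
            sink.addReference(item.id, item.referenceTarget);
        }
        if (pending > 0 || !deferred.isEmpty()) {
            sink.save();
        }
    }
}

// Records the calls it receives so the ordering can be inspected.
final class RecordingSink implements ContentSink {
    final List<String> log = new ArrayList<>();
    int saves = 0;

    public void addNode(String id) { log.add("node " + id); }
    public void addReference(String fromId, String toId) { log.add("ref " + fromId + "->" + toId); }
    public void save() { saves++; log.add("save"); }
}
```

The price of this approach is that the import is no longer atomic: if
it fails halfway, the batches saved so far stay in the workspace.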

>  * the various UID resolution modes

IMPORT_UUID_COLLISION_REMOVE_EXISTING is the best mode to use when you
want to recreate the exact content structure that was exported before.

IMPORT_UUID_COLLISION_REPLACE_EXISTING is the mode to use when you
want to keep any restructurings that have been performed on the
workspace after the content was exported.

IMPORT_UUID_CREATE_NEW is good if you want to create a full copy of
the exported content tree without interfering with existing content in
the workspace.

IMPORT_UUID_COLLISION_THROW is the best alternative when migrating
content to a new workspace where the same content is not supposed to
already exist.
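
To make the difference between the modes concrete, here is a toy model
that tracks only where a referenceable node with a given UUID ends up.
It is a sketch under simplifying assumptions, not how Jackrabbit
implements imports: ToyWorkspace and the UuidBehavior enum are
hypothetical names that merely mirror the constants of
javax.jcr.ImportUUIDBehavior, and subtrees, properties and saving are
ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy counterpart of the four javax.jcr.ImportUUIDBehavior constants.
enum UuidBehavior {
    CREATE_NEW,
    COLLISION_REMOVE_EXISTING,
    COLLISION_REPLACE_EXISTING,
    COLLISION_THROW
}

// Hypothetical model: maps the UUID of each referenceable node to its path.
final class ToyWorkspace {
    final Map<String, String> pathByUuid = new HashMap<>();

    // Imports one node and returns the UUID it is stored under.
    String importNode(String uuid, String importPath, UuidBehavior behavior) {
        switch (behavior) {
            case CREATE_NEW: {
                // The incoming node gets a fresh UUID, so nothing can collide.
                String fresh = UUID.randomUUID().toString();
                pathByUuid.put(fresh, importPath);
                return fresh;
            }
            case COLLISION_REMOVE_EXISTING:
                // An existing node is removed wherever it is; the incoming
                // node is added at the import location.
                pathByUuid.put(uuid, importPath);
                return uuid;
            case COLLISION_REPLACE_EXISTING:
                // An existing node is replaced in place, so its current
                // position in the workspace is kept.
                pathByUuid.putIfAbsent(uuid, importPath);
                return uuid;
            case COLLISION_THROW:
                // The real API signals this case with an ItemExistsException.
                if (pathByUuid.containsKey(uuid)) {
                    throw new IllegalStateException("UUID collision: " + uuid);
                }
                pathByUuid.put(uuid, importPath);
                return uuid;
            default:
                throw new AssertionError(behavior);
        }
    }
}
```

For example, if a node was moved to /moved/doc after the export, the
REPLACE_EXISTING case leaves it at /moved/doc, while the
REMOVE_EXISTING case puts it back at the import location.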


Jukka Zitting
