jackrabbit-users mailing list archives

From sam lee <skyn...@gmail.com>
Subject Re: repository synchronization
Date Tue, 03 May 2011 00:35:27 GMT
rsync does diff and compression.

What is the Jackrabbit equivalent of diff? Is there an efficient way of getting
the list of nodes that should be transported over?
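
One rsync-like way to get that list, as a rough sketch under stated
assumptions: compute a digest per node over its own properties and the
digests of its children on both sides, then descend only into subtrees
whose digests differ. SimpleNode and SubtreeDiff below are hypothetical
stand-ins written in plain Java, not Jackrabbit or JCR classes. A real
version would read nodes through javax.jcr and would cache digests
instead of recomputing them at every level.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a repository node; NOT the javax.jcr.Node interface. */
final class SimpleNode {
    final String name;
    // TreeMap keeps iteration order stable, so digests are deterministic.
    final Map<String, String> properties = new TreeMap<>();
    final Map<String, SimpleNode> children = new TreeMap<>();

    SimpleNode(String name) { this.name = name; }

    /** Returns the named child, creating it if it does not exist yet. */
    SimpleNode child(String childName) {
        return children.computeIfAbsent(childName, SimpleNode::new);
    }

    SimpleNode set(String key, String value) {
        properties.put(key, value);
        return this;
    }
}

/** Hypothetical helper: lists the paths that differ between two trees. */
final class SubtreeDiff {
    /** SHA-256 over the node's properties and, recursively, its children. */
    public static String digest(SimpleNode node) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        for (Map.Entry<String, String> p : node.properties.entrySet()) {
            update(md, "p", p.getKey(), p.getValue());
        }
        for (SimpleNode c : node.children.values()) {
            update(md, "c", c.name, digest(c));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    private static void update(MessageDigest md, String... parts) {
        for (String part : parts) {
            md.update(part.getBytes(StandardCharsets.UTF_8));
            md.update((byte) 0); // separator between fields
        }
    }

    /**
     * Paths to transport from source to target. A path whose node is missing
     * on the target stands for the whole subtree; otherwise it means the
     * node's own properties differ. Nodes that exist only on the target
     * (deletions) are ignored in this sketch.
     */
    public static List<String> changedPaths(SimpleNode source, SimpleNode target, String path) {
        List<String> out = new ArrayList<>();
        collect(source, target, path, out);
        return out;
    }

    private static void collect(SimpleNode source, SimpleNode target, String path, List<String> out) {
        if (target == null) {
            out.add(path); // whole subtree is missing on the target
            return;
        }
        if (digest(source).equals(digest(target))) {
            return; // identical subtree, nothing to send
        }
        if (!source.properties.equals(target.properties)) {
            out.add(path);
        }
        for (SimpleNode c : source.children.values()) {
            collect(c, target.children.get(c.name), path + "/" + c.name, out);
        }
    }
}
```

Calling SubtreeDiff.changedPaths(sourceRoot, targetRoot, "/content") returns
the paths under /content that would need to go over the wire. Subtrees with
equal digests are skipped without being walked, which is where the saving
on a second pass would come from.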

On Mon, May 2, 2011 at 7:03 PM, Jürgen Baier <juerphen@googlemail.com> wrote:

> 2011/5/2 sam lee <skynare@gmail.com>:
> > Yah, it looks like the fastest way of migrating data is to transport the
> > entire repository filesystem.
> >
> > http://wiki.apache.org/jackrabbit/BackupAndMigration#Low%20Level%20Backup
> >
> > But it'd be nice if there were a way to selectively migrate some path of
> > the repository.
> >
>
> That is also what I aim for...
>
> >
> > Do you know of a data transport API? JCR doesn't seem to define any.
> > By transport API, I mean something like this:
> > "transport /content/foo/bar/*   from localhost:8080  to
> > saml.com:3040/content/foo/bar/copy/"
> >
> > Would you use RMI for this?
> >
>
> I do not currently know of any transport API. Because of the infrastructure
> I use jackrabbit in, I would do that via EJBs. A naive approach
> could be iterating through the subtree_to_copy on the source machine
> and creating (via an EJB on the remote machine) the nodes with
> properties/versions/... on the target machine. I am sure you could do
> the same thing by accessing the remote repo via RMI. I used RMI-access
> some time ago and it was quite nice, but due to security concerns I
> deactivated the RMI servlet in my setup.
>
> Using a SyncFactory that returns either an RMI- or an
> EJB-transport-wrapper, this could nicely be solved so that once it is
> done (RMI and EJB) people can use what they want. I would be willing
> to do the EJB-stuff, and also help/work on the basic syncing as I
> consider that an important thing.
>
> I am aware that the EJB-thing is a "custom" wish by me, since
> jackrabbit comes with the RMI-access out-of-the-box, so the RMI-sync
> would be the default method.
>
> >
> > On Mon, May 2, 2011 at 8:21 AM, Jürgen Baier <juerphen@googlemail.com> wrote:
> >
> >> Hi,
> >>
> >> some time ago I tried something similar and used xml-export. This is
> >> not an option for non-trivial data, since the export/import is very,
> >> very slow (for your 500GB it would be much more than one day to export
> >> to xml, if I remember it correctly; was something in the range of
> >> hours/GB on my machine).
> >>
> >> What worked for me was using the filesystem-store and copying the
> >> whole repo-dir to the target machine. Still, I am interested in some
> >> sync-tool, because the ability to copy just a sub-tree of the whole
> >> repo would allow me to copy single users (their "home"-node and all
> >> nodes below that) to another machine. Since my jackrabbit-repos run as
> >> shared jee-resource I was thinking about a jee-solution, where I read
> >> the nodes on the initial machine and copy them to the target machine.
> >> But maybe I just miss a cool tool out there that already does this.
> >>
> >> Regards,
> >> Jürgen
> >>
> >>
> >> 2011/5/2 sam lee <skynare@gmail.com>:
> >> > Hey,
> >> >
> >> > I have a large repository. And, I have a few empty repositories.
> >> > How can I synchronize empty repositories with the content from the
> >> > large repository?
> >> >
> >> > Is there an rsync-like tool where subsequent synchronization (data
> >> > migration) is much quicker than the initial pass?
> >> >
> >> > Is xml export/import the only option? Has anyone tried export/import
> >> > on a huge repository (500GB and growing)?
> >> >
> >> > Or, is there a way to rsync repository filesystem directory (not
> >> > through JCR but using the commandline tool)?
> >> >
> >>
> >
>
