jackrabbit-oak-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Online backup
Date Mon, 12 Mar 2012 10:50:30 GMT

On Sat, Mar 10, 2012 at 11:32 PM, Jörg Hoh <jhoh228@googlemail.com> wrote:
> We should have the possibility to create backup during normal operation of
> the repository, without shutting down the repository and without major
> impact to read or write performance. A true online backup.

With the Oak architecture as currently envisioned, there are at least
three alternative ways to achieve this:

1) The MVCC model gives us a stable snapshot of the repository state
at any given time, so a backup client should be able to export a
snapshot of the entire repository without interfering (except for the
extra IO overhead and potential cache impact) with normal repository
operation.

2) Assuming we get the clustering architecture right (which we
should), it should be possible to add a new read-only node to an
existing cluster, wait for it to synchronize all existing content from
the rest of the cluster, and finally stop this backup node. The result
should be a complete, runnable copy of the repository.

3) Since the Oak architecture builds on immutable data, most
persistence models will likely employ an append-only approach with
garbage collection to clean up unused space. With a little coordination
from the garbage collector, it should be possible to also get a stable
snapshot of the entire repository with native backup tools of the
underlying persistence mechanism.
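
To make options 1 and 3 a bit more concrete, here is a toy Python sketch.
It is not Oak code: ToyStore, pin(), unpin() and collect_garbage() are
hypothetical names invented for illustration, and each revision is a full
copy of the state rather than a structure-sharing tree. It only shows the
principle that a backup can read a pinned, immutable revision while writes
and garbage collection carry on around it.

```python
class ToyStore:
    """Toy append-only MVCC store: every commit adds a new immutable revision."""

    def __init__(self):
        self.revisions = [{}]   # revision 0 is the empty repository
        self.pinned = set()     # revisions a backup is currently reading

    def head(self):
        return len(self.revisions) - 1

    def commit(self, changes):
        # Never modify an old revision; derive a new one from the current head.
        new_state = dict(self.revisions[-1])
        for path, value in changes.items():
            if value is None:
                new_state.pop(path, None)   # None marks a removed path
            else:
                new_state[path] = value
        self.revisions.append(new_state)
        return self.head()

    def read(self, revision):
        state = self.revisions[revision]
        if state is None:
            raise KeyError("revision %d was garbage collected" % revision)
        return state

    def pin(self, revision):
        self.pinned.add(revision)

    def unpin(self, revision):
        self.pinned.discard(revision)

    def collect_garbage(self):
        # The "little coordination": keep the head and anything a backup pinned.
        for rev in range(self.head()):
            if rev not in self.pinned:
                self.revisions[rev] = None


store = ToyStore()
store.commit({"/a": 1, "/b": 2})

snapshot = store.head()                   # the backup picks a stable revision
store.pin(snapshot)

store.commit({"/a": 10, "/c": 3})         # normal writes continue meanwhile
store.collect_garbage()                   # GC runs but skips the pinned revision

backup = dict(store.read(snapshot))       # export the snapshot
store.unpin(snapshot)
```

The backup sees exactly the state as of the pinned revision, regardless of
the later commit, and the garbage collector is free to reclaim everything
that is neither pinned nor the current head.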

> A bonus would be if this backup facility is additionally able to produce
> a diff to the latest backup (incremental backup).

I believe this should be doable with all the above approaches.
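
As a rough illustration of why, an incremental backup is just the
difference between two stable snapshots. The Python sketch below compares
two snapshots naively, path by path; diff_snapshots() and apply_diff() are
hypothetical helpers, and a real implementation would presumably exploit
the shared immutable structure instead of scanning everything.

```python
def diff_snapshots(old, new):
    """Describe what changed between two {path: value} snapshots."""
    added = {p: v for p, v in new.items() if p not in old}
    changed = {p: v for p, v in new.items() if p in old and old[p] != v}
    removed = sorted(p for p in old if p not in new)
    return {"added": added, "changed": changed, "removed": removed}


def apply_diff(base, diff):
    """Rebuild the newer snapshot from an older backup plus a diff."""
    result = dict(base)
    result.update(diff["added"])
    result.update(diff["changed"])
    for path in diff["removed"]:
        result.pop(path, None)
    return result


last_backup = {"/a": 1, "/b": 2}
current = {"/a": 10, "/c": 3}

incremental = diff_snapshots(last_backup, current)
restored = apply_diff(last_backup, incremental)
```

Restoring the last full backup and then applying the incremental diff
yields the same content as the current snapshot.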


Jukka Zitting
