jackrabbit-oak-dev mailing list archives

From Marcel Reutegger <mreut...@adobe.com>
Subject workspaces & version storage
Date Wed, 05 Dec 2012 11:06:45 GMT

this is related to the discussion around how to implement JCR versioning
in Oak. in addition to the implementation details for JCR versioning we also
need to finalize how and whether Oak supports workspaces. so far the
Oak API simply allows you to specify a workspace name on login. I think
this is sufficient if we are just talking about workspaces, but is probably
not sufficient when a version storage comes into play.

if we keep it that way we basically require oak-core (and/or plugins) to
maintain a repository wide version storage tree and map it into a
workspace at /jcr:system/jcr:versionStorage. Alternatively we could
expose the version storage in a separate 'workspace' and oak-jcr
takes care of mapping it into the workspace hierarchy at the correct
location. the challenge here is how to update the version storage and
the workspace content in a single transaction. given this requirement
it seems to me the version-storage-as-a-workspace approach is not
viable. *unless* it is only exposed read-only and modifications to the
version storage are triggered through updates in the workspace.
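
A minimal sketch of the "read-only version storage, modified only through workspace updates" idea, assuming a flat path-to-value model. The class VersioningCommitSketch, the "v<n>" version names and the use of jcr:isCheckedOut=false as the check-in trigger are illustrative assumptions, not the Oak API. Changes are staged on a copy and swapped in at the end, so workspace content and version storage change together or not at all:

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical flat model (repository path -> value); not an Oak class. */
final class VersioningCommitSketch {
    static final String VERSION_STORAGE = "/jcr:system/jcr:versionStorage";
    private static final String CHECKED_OUT = "/jcr:isCheckedOut";

    private Map<String, String> content = new TreeMap<>();
    private int versionCounter = 0;

    /** Returns a read-only snapshot of the version storage subtree. */
    synchronized Map<String, String> versionStorageView() {
        Map<String, String> view = new TreeMap<>();
        for (Map.Entry<String, String> e : content.entrySet()) {
            if (e.getKey().startsWith(VERSION_STORAGE + "/")) {
                view.put(e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(view);
    }

    /**
     * Applies workspace changes plus the version storage changes derived
     * from them. Everything is staged on a copy first, so a rejected
     * commit leaves the repository untouched.
     */
    synchronized void commit(Map<String, String> workspaceChanges) {
        Map<String, String> staged = new TreeMap<>(content);
        int counter = versionCounter;
        for (Map.Entry<String, String> change : workspaceChanges.entrySet()) {
            String path = change.getKey();
            if (path.startsWith("/jcr:system")) {
                // clients may not write to the version storage directly
                throw new IllegalArgumentException("read-only: " + path);
            }
            staged.put(path, change.getValue());
            if (path.endsWith(CHECKED_OUT) && "false".equals(change.getValue())) {
                // assumed trigger: a check-in creates a new version entry
                String nodePath = path.substring(0, path.length() - CHECKED_OUT.length());
                counter++;
                staged.put(VERSION_STORAGE + nodePath + "/v" + counter,
                        "snapshot of " + nodePath);
            }
        }
        content = staged;
        versionCounter = counter;
    }
}
```

Committing a change that sets /content/page/jcr:isCheckedOut to "false" adds an entry below /jcr:system/jcr:versionStorage/content/page in the same step, while a commit that tries to write below /jcr:system directly is rejected.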

now, what does that mean for the current oak-core implementation?
to make version operations transactional we need the workspace
Tree and the version storage Tree under the same Root. this is trivial
if we just store the version storage under /jcr:system inside the
workspace. however that limits us to just one single workspace and
makes it IMO quite difficult to implement multi-workspace support
later. the only alternative that truly keeps the door open to support
multiple workspaces is to decouple *at least* access to the version
storage from a client perspective. what I mean is, if we now say the
version storage is accessible (possibly also writeable) in an Oak workspace
at a specific location, then we have to stick to that layout forever unless
we break backward compatibility. therefore I think it would be good to
define right now that the version storage is a separate workspace, though
internally an initial implementation wouldn't actually have multiple
workspaces but would just expose the jcr:system tree in another workspace (as
a temporary solution). oak-jcr will then use that workspace to read versions.
alternatively we could also implement the jcr:system workspace on top
of the Oak API by exposing jcr:system as the Root of a Tree hierarchy.

what we can do in a second step is to implement true multi workspace
support. IMO this means we will have to introduce an additional Tree
level for the workspace name. the hierarchy would look like this:

+ <root>
    + default
        + content
            + ...
    + jcr:system
        + jcr:versionStorage
        + ...

Logging into a workspace will then only expose a sub tree and not the
entire content as accessible through the NodeStore. E.g. a login to
the 'default' workspace will provide:

+ Root (default) -> /
    + Tree (content) -> /content
        + Tree (...) -> /content/...
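
The mapping between the two hierarchies above can be sketched as a pair of path translations. This is only an illustration under the assumption that the workspace name becomes the first path element in the repository-wide tree; WorkspacePaths and its methods are hypothetical names, not part of the Oak API:

```java
/** Hypothetical helper; illustrates the extra Tree level for the workspace name. */
final class WorkspacePaths {
    private WorkspacePaths() {}

    /** Maps a workspace-local absolute path to a repository-wide path. */
    static String toRepositoryPath(String workspaceName, String localPath) {
        if (!localPath.startsWith("/")) {
            throw new IllegalArgumentException("not an absolute path: " + localPath);
        }
        String prefix = "/" + workspaceName;
        return "/".equals(localPath) ? prefix : prefix + localPath;
    }

    /**
     * Maps a repository-wide path back to a workspace-local path.
     * Returns null if the path lies outside the given workspace.
     */
    static String toWorkspacePath(String workspaceName, String repositoryPath) {
        String prefix = "/" + workspaceName;
        if (repositoryPath.equals(prefix)) {
            return "/";
        }
        if (repositoryPath.startsWith(prefix + "/")) {
            return repositoryPath.substring(prefix.length());
        }
        return null;
    }
}
```

With this, /content in the 'default' workspace corresponds to /default/content in the repository, and /jcr:system/jcr:versionStorage has no counterpart inside 'default'.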

This has a number of implications. Validators, CommitHooks and friends
will see the entire repository, while they currently assume they only see a
default workspace. We'd have to change those implementations to take
the workspace name of the currently logged in session into account. This
actually extends to any code that operates on NodeStates or the NodeStore.
Most notably this includes the QueryEngine and the QueryIndexProviders,
which are repository wide as well right now. A simple solution may be
to just wrap the NodeStore early with a WorkspaceNodeStore and only
expose the workspace NodeStates. However in some cases we do want
to have access to other workspaces. This includes cross-workspace operations as
well as version operations in JCR. So, it might be better to always operate on
the repository-wide NodeStore and make the code workspace aware.
there's obviously a reason why we have the workspace parameter in the
login method, even though we don't use it at all right now.
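
To make the two options concrete, here is a rough sketch using a toy tree. SimpleNode, WorkspaceView and WorkspaceAwareAccess are hypothetical stand-ins, not the actual NodeState or NodeStore interfaces. Option 1 wraps early and hides everything outside one workspace; option 2 keeps the repository-wide root and passes the workspace name along:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, much simplified node tree; not the Oak NodeState API. */
final class SimpleNode {
    private final Map<String, SimpleNode> children = new LinkedHashMap<>();

    /** Returns the named child, creating it if necessary. */
    SimpleNode addChild(String name) {
        return children.computeIfAbsent(name, n -> new SimpleNode());
    }

    /** Returns the named child, or null if it does not exist. */
    SimpleNode getChild(String name) {
        return children.get(name);
    }

    boolean hasChild(String name) {
        return children.containsKey(name);
    }
}

/** Option 1: wrap early and expose only the subtree of one workspace. */
final class WorkspaceView {
    private final SimpleNode repositoryRoot;
    private final String workspaceName;

    WorkspaceView(SimpleNode repositoryRoot, String workspaceName) {
        this.repositoryRoot = repositoryRoot;
        this.workspaceName = workspaceName;
    }

    /** The workspace subtree; jcr:system and other workspaces are unreachable. */
    SimpleNode getRoot() {
        return repositoryRoot.getChild(workspaceName);
    }
}

/** Option 2: code sees the whole repository and is told the workspace name. */
final class WorkspaceAwareAccess {
    private WorkspaceAwareAccess() {}

    static SimpleNode workspaceRoot(SimpleNode repositoryRoot, String workspaceName) {
        return repositoryRoot.getChild(workspaceName);
    }

    /** Version operations can still reach the shared version storage. */
    static SimpleNode versionStorage(SimpleNode repositoryRoot) {
        SimpleNode system = repositoryRoot.getChild("jcr:system");
        return system == null ? null : system.getChild("jcr:versionStorage");
    }
}
```

Code behind a WorkspaceView has no way to reach jcr:versionStorage, which is the limitation described above; the workspace-aware variant can, at the price of every caller having to handle the workspace name.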

so, bottom line...

my proposal is to have an initial JCR versioning implementation that stores
the version histories in the default workspace. the version storage is ideally
exposed as a separate (read-only) workspace [0] and oak-jcr accesses this
workspace to read the versions. Alternatively this could also be implemented
on top of the Oak API in oak-jcr. This approach allows us to introduce multiple
workspaces later, while access to the version storage from oak-jcr is still the same.


[0] we could actually generalize this and say the workspaceName parameter
on login may also identify a sub tree in a workspace. e.g. something like:
repo.login(credentials, "default:/foo/bar"). you'd then get a Root located
at /foo/bar.
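
A minimal sketch of how such a workspaceName value might be split, assuming the first ":/" separates the workspace name from the sub tree path and a value without ":/" means the workspace root. WorkspaceLocator is a hypothetical class for illustration, not something the Oak API provides:

```java
/** Hypothetical parser for values like "default:/foo/bar". */
final class WorkspaceLocator {
    final String workspaceName;
    final String rootPath;

    private WorkspaceLocator(String workspaceName, String rootPath) {
        this.workspaceName = workspaceName;
        this.rootPath = rootPath;
    }

    /**
     * Splits at the first ":/" (assumed separator). Looking for ":/"
     * rather than ":" keeps namespaced names such as "jcr:system" intact.
     */
    static WorkspaceLocator parse(String spec) {
        int idx = spec.indexOf(":/");
        if (idx < 0) {
            return new WorkspaceLocator(spec, "/");
        }
        return new WorkspaceLocator(spec.substring(0, idx), spec.substring(idx + 1));
    }
}
```

Under these assumptions "default:/foo/bar" yields workspace 'default' with a Root at /foo/bar, and a plain "default" yields the same workspace with a Root at /.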
