jackrabbit-users mailing list archives

From Thomas Müller <thomas.muel...@day.com>
Subject Re: Is there a way to store JackRabbit documents in two different datastores for one repository and yet index them with Lucene
Date Tue, 20 Jul 2010 08:22:20 GMT
Hi,

To store binaries, Jackrabbit uses the datastore API which is
(relatively) stable and here to stay. The plan is to support it in
Jackrabbit 3 as well. It's a relatively simple mechanism: you add a
binary and get a unique identifier. There is even a Jackrabbit API to
get the identifier: JackrabbitValue.getContentIdentity().
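To illustrate the "add a binary, get a unique identifier" idea, here is
a minimal in-memory sketch. It is not the Jackrabbit DataStore API:
ContentAddressedStore and its methods are hypothetical names, and I am
assuming the identifier is a hex-encoded SHA-1 digest of the content.
The point is only that adding the same bytes twice yields the same
identifier and does not create a second entry.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a data store (not the Jackrabbit API).
class ContentAddressedStore {
    private final Map<String, byte[]> entries = new HashMap<>();

    // Stores the bytes and returns an identifier derived from the content.
    // Assumption: the identifier is a hex-encoded SHA-1 digest.
    String add(byte[] content) {
        String id = identifierOf(content);
        // If an entry with this identifier already exists, it is re-used.
        entries.putIfAbsent(id, content.clone());
        return id;
    }

    // Returns a copy of the stored bytes, or null if the identifier is unknown.
    byte[] get(String id) {
        byte[] stored = entries.get(id);
        return stored == null ? null : stored.clone();
    }

    // Number of distinct entries currently held.
    int size() {
        return entries.size();
    }

    static String identifierOf(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the identifier depends only on the content, two processes that
add the same binary independently arrive at the same identifier.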

What you could do is:

- Implement a new "multiplex-datastore" that supports a number of
sub-datastores. It reads an object from whichever sub-datastore
contains the entry. When storing, it first checks whether the object
is already stored in any sub-datastore, and only if it is not does it
store the object in the first sub-datastore.

- Configure Jackrabbit to use this multiplex-datastore.

- Store the document from within your application in the second
datastore (see below). The datastore supports concurrent writes, so
this can happen in a different process if you want.

- You probably need to store (stream) the object in Jackrabbit as
well, but this will not create a new entry in the data store (the
existing one is re-used). In theory, to avoid streaming the object
again, you could create a JackrabbitValue yourself, but I didn't test
that. It would be a nice feature in Jackrabbit.

The solution would look like this:

MultiplexDataStore (<= used by Jackrabbit) links to
* FileDataStore @ /data/jackrabbit/ (<= used by Jackrabbit to store new objects)
* FileDataStore @ /data/app/ (<= used by your application)
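
A rough sketch of the multiplexing logic, under some loud assumptions:
SimpleStore, MemoryStore and MultiplexStore are hypothetical names, the
interface is far smaller than the real Jackrabbit DataStore interface,
the sub-stores are in-memory maps instead of FileDataStore instances,
and the caller passes the identifier in rather than having the store
derive it. It only shows the read-from-any, write-to-first-if-missing
rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal store contract (not the Jackrabbit DataStore interface).
interface SimpleStore {
    boolean contains(String id);
    byte[] read(String id);                 // returns null when the entry is missing
    void write(String id, byte[] content);
}

// In-memory stand-in for one sub-datastore.
class MemoryStore implements SimpleStore {
    private final Map<String, byte[]> entries = new HashMap<>();

    public boolean contains(String id) { return entries.containsKey(id); }
    public byte[] read(String id) { return entries.get(id); }
    public void write(String id, byte[] content) { entries.put(id, content); }
    int size() { return entries.size(); }
}

// Delegates to an ordered list of sub-stores.
class MultiplexStore implements SimpleStore {
    private final List<SimpleStore> subStores;

    MultiplexStore(List<? extends SimpleStore> subStores) {
        if (subStores.isEmpty()) {
            throw new IllegalArgumentException("at least one sub-store is required");
        }
        this.subStores = new ArrayList<>(subStores);
    }

    public boolean contains(String id) {
        for (SimpleStore store : subStores) {
            if (store.contains(id)) {
                return true;
            }
        }
        return false;
    }

    // Reads from the first sub-store that contains the entry.
    public byte[] read(String id) {
        for (SimpleStore store : subStores) {
            if (store.contains(id)) {
                return store.read(id);
            }
        }
        return null;
    }

    // Writes to the first sub-store, but only if no sub-store has the entry yet.
    public void write(String id, byte[] content) {
        if (!contains(id)) {
            subStores.get(0).write(id, content);
        }
    }
}
```

With the first sub-store standing in for the Jackrabbit one and the
second for the application one, an object the application has already
written is found on read and is not copied into the first sub-store
when it is stored through the multiplex store.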

Regards,
Thomas
