jackrabbit-oak-dev mailing list archives

From Michael Dürig <mdue...@apache.org>
Subject References, referenceables and referential integrity
Date Thu, 04 Apr 2013 11:34:40 GMT


I was looking into how to enforce referential integrity for 
referenceable nodes (https://issues.apache.org/jira/browse/OAK-685,

Currently references are implemented through a (unique) query index on 
the uuid property. Resolving references and finding references to a 
referenceable node thus involves doing a query. If we want to enforce 
referential integrity in this design, we'd need access to an up to date 
query index from within the respective commit hook. This could be either 
through a query engine or some other means to access the uuid index.

Instead of this we could however change the design such that no query 
index is needed to track references. In such a design each referenced 
node would contain back references to all nodes referring to it. A 
commit hook could be employed to keep the back references up to date. 
Furthermore, that commit hook could enforce referential integrity simply 
by checking whether the set of back references is empty on remove.
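
To make the back-reference idea concrete, here is a minimal in-memory 
sketch. It is not Oak's commit hook API: the class BackReferenceTracker, 
its method names, and the use of plain path strings to identify nodes 
are all assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of back-reference tracking; not part of Oak. */
class BackReferenceTracker {
    // Target node path -> paths of the nodes that hold a reference to it.
    private final Map<String, Set<String>> backRefs = new HashMap<>();

    /** Called when a reference property pointing at target is added to source. */
    void addReference(String source, String target) {
        backRefs.computeIfAbsent(target, k -> new HashSet<>()).add(source);
    }

    /** Called when the reference from source to target is removed. */
    void removeReference(String source, String target) {
        Set<String> sources = backRefs.get(target);
        if (sources != null) {
            sources.remove(source);
            if (sources.isEmpty()) {
                backRefs.remove(target); // keep the map free of empty sets
            }
        }
    }

    /** Finds referring nodes without running a query. */
    Set<String> referrers(String target) {
        Set<String> sources = backRefs.get(target);
        return sources == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<>(sources));
    }

    /** Rejects removal of a node that is still referenced. */
    void checkRemove(String target) {
        if (backRefs.containsKey(target)) {
            throw new IllegalStateException(
                    "Node " + target + " is still referenced by " + backRefs.get(target));
        }
    }
}
```

In a real commit hook the add/remove calls would be driven by the diff 
between the before and after states of a commit, and the back 
references would be persisted with the referenced node rather than 
held in a map.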

However, this design is not enough to ensure uniqueness of uuids and to 
look up nodes by uuid. For this we still need some kind of an index 
structure. So we could roll our own here or reuse query indexes. In the 
latter case the commit hook again needs access to the query index in 
order to do its job of updating back references.
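
Whichever way the index is provided, the two operations it has to 
support are small. The following sketch shows them under the assumption 
of a simple in-memory map; UuidIndex and its methods are hypothetical 
names and do not correspond to any existing Oak class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical uuid index covering uniqueness and lookup; not part of Oak. */
class UuidIndex {
    // uuid -> path of the referenceable node carrying that uuid
    private final Map<String, String> pathByUuid = new HashMap<>();

    /** Registers a uuid for a node and fails if another node already uses it. */
    void register(String uuid, String path) {
        String existing = pathByUuid.putIfAbsent(uuid, path);
        if (existing != null && !existing.equals(path)) {
            throw new IllegalStateException(
                    "uuid " + uuid + " is already used by " + existing);
        }
    }

    /** Drops the entry when the referenceable node is removed. */
    void unregister(String uuid) {
        pathByUuid.remove(uuid);
    }

    /** Resolves a reference to the path of its target node, if any. */
    Optional<String> lookup(String uuid) {
        return Optional.ofNullable(pathByUuid.get(uuid));
    }
}
```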

In summary the options are:

a) Build our own ad-hoc index structure for uuid uniqueness and lookup. 
Use back references to find referring nodes and to enforce referential 
integrity.

b) Use query indexes for uuid uniqueness and lookup and for enforcing 
referential integrity in a commit hook and for finding referring nodes.

c) Use query indexes for uuid uniqueness and lookup and for enforcing 
referential integrity in a commit hook. Use back references to find 
referring nodes. In this scenario the commit hook still needs access to 
the query index in order to be able to properly update the back references.

I'm not in favour of c) since it adds complexity from both worlds and I 
don't see much added value.

For b), it would be best if we had a way to access query indexes without 
having to go through an actual query.

Finally a) duplicates some of the indexing logic we have already for 
query indexes, but can do that in a way which is optimal for handling 

Implementation-wise, b) would be the least effort and a) is probably the 
leanest, cleanest and meanest solution.


