jackrabbit-oak-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: commit hooks and indexing
Date Fri, 15 Feb 2013 07:11:50 GMT

On Thu, Feb 14, 2013 at 10:29 PM, Michael Dürig <mduerig@apache.org> wrote:
> However, there is a difference depending on whether the index is stored in
> content or external. For the former case using commit hooks is the right
> thing to do. In the case of a failed commit nothing is written at all, not
> even the index data. Using an observer here still works, but would leave
> the index lagging behind for the time the commit actually succeeded until
> the observer is finally called.

I think both options are valid for an in-content index; the basic
trade-off here is between commit speed and conflict handling on the
one hand and instant availability of index updates on the other.

A hook-based index is by definition always up to date with the latest
content, and is thus useful especially for things like UUID tables and
other internal indices that need to be kept up to date at all times.

However, hooks add overhead to each individual commit and will either
need automatic conflict resolution or synchronous execution to avoid
index corruption in cases of concurrent commits. That makes them
non-ideal for many of the more complex types of indices.
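
As a rough sketch of the hook-based option with synchronous execution:
the types below (ToyCommitHook, ToyRepository, UuidIndexHook) and the
/content/ and /index/uuid/ path layout are made up for illustration and
are not the Oak API. The hook runs inside the commit, so the index
entries are published together with the content, and a rejected commit
leaves nothing behind.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hook interface: sees the staged state before it is
// published and may add to it, or reject the commit by throwing.
interface ToyCommitHook {
    void process(Map<String, String> staged);
}

// Toy repository mapping paths to values (illustration only).
final class ToyRepository {
    private Map<String, String> head = new HashMap<>();
    private final ToyCommitHook hook;

    ToyRepository(ToyCommitHook hook) {
        this.hook = hook;
    }

    // synchronized: commits, and therefore hooks, run one at a time,
    // which is the "synchronous execution" way of avoiding corruption.
    synchronized void commit(Map<String, String> changes) {
        Map<String, String> staged = new HashMap<>(head);
        staged.putAll(changes);
        hook.process(staged); // if this throws, head stays untouched
        head = staged;        // content and index become visible together
    }

    synchronized String get(String path) {
        return head.get(path);
    }
}

// Maintains /index/uuid/<uuid> -> path for every /content/ entry,
// assuming each content value is that node's UUID.
final class UuidIndexHook implements ToyCommitHook {
    @Override
    public void process(Map<String, String> staged) {
        // Iterate over a snapshot because entries are added to staged.
        for (Map.Entry<String, String> e : new HashMap<>(staged).entrySet()) {
            if (!e.getKey().startsWith("/content/")) {
                continue;
            }
            String indexPath = "/index/uuid/" + e.getValue();
            String owner = staged.get(indexPath);
            if (owner != null && !owner.equals(e.getKey())) {
                throw new IllegalStateException("duplicate UUID " + e.getValue());
            }
            staged.put(indexPath, e.getKey());
        }
    }
}
```

The single lock is what makes every commit pay for the index update.
Dropping it would require merging concurrent index changes instead.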

Luckily most of the potential complex indices don't need to be up to
date at all times, and thus can well be updated via an observer even
if the index content is stored in the repository. In such cases the
observer treats the repository like any other external index storage
(i.e. the index is written through separate commits rather than as
part of the observed commit, which is how hooks work), and would just
need to make sure to ignore the content updates it itself makes.
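
A minimal sketch of the observer-based variant, again with made-up
types (ToyObserver, ObservedRepository, LaggingIndexObserver) rather
than the Oak API. It assumes each commit carries a committer name, which
the observer uses to recognise and skip its own index commits. The
explicit drain() call stands in for whatever background scheduling a
real implementation would use, and the sketch is single-threaded apart
from the repository lock.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical observer interface: called after a commit has succeeded.
interface ToyObserver {
    void contentChanged(String committer, Map<String, String> changes);
}

// Toy repository mapping paths to values (illustration only).
final class ObservedRepository {
    private final Map<String, String> head = new HashMap<>();
    private ToyObserver observer;

    void setObserver(ToyObserver observer) {
        this.observer = observer;
    }

    synchronized void commit(String committer, Map<String, String> changes) {
        head.putAll(changes); // the commit itself does no index work
        if (observer != null) {
            observer.contentChanged(committer, new HashMap<>(changes));
        }
    }

    synchronized String get(String path) {
        return head.get(path);
    }
}

// Queues observed changes and later writes index entries through
// separate commits, ignoring the notifications those commits trigger.
final class LaggingIndexObserver implements ToyObserver {
    static final String SELF = "index-updater"; // assumed committer name
    private final ObservedRepository repo;
    private final Queue<Map<String, String>> pending = new ArrayDeque<>();
    int ignoredOwnCommits = 0;

    LaggingIndexObserver(ObservedRepository repo) {
        this.repo = repo;
    }

    @Override
    public void contentChanged(String committer, Map<String, String> changes) {
        if (SELF.equals(committer)) {
            ignoredOwnCommits++; // our own index update, nothing to do
            return;
        }
        pending.add(changes);
    }

    // Processes queued changes; until this runs, the index lags behind.
    void drain() {
        Map<String, String> changes;
        while ((changes = pending.poll()) != null) {
            Map<String, String> indexUpdate = new HashMap<>();
            for (Map.Entry<String, String> e : changes.entrySet()) {
                if (e.getKey().startsWith("/content/")) {
                    indexUpdate.put("/index/uuid/" + e.getValue(), e.getKey());
                }
            }
            if (!indexUpdate.isEmpty()) {
                repo.commit(SELF, indexUpdate);
            }
        }
    }
}
```

Between a content commit and the next drain() the index entry is simply
missing, which is the lag discussed above.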

> For externally stored indexes I think we need to live with the lag in favour
> of having a consistent index.

Right; without implementing full distributed transaction support (and
the associated concurrency overhead) it's impossible to keep an
external index in sync with the repository at all times.


Jukka Zitting
