jackrabbit-oak-dev mailing list archives

From Tommaso Teofili <teof...@adobe.com>
Subject Re: commit hooks and indexing
Date Fri, 15 Feb 2013 09:20:00 GMT

On 15/feb/2013, at 08:11, Jukka Zitting wrote:

> Hi,
> 
> On Thu, Feb 14, 2013 at 10:29 PM, Michael Dürig <mduerig@apache.org> wrote:
>> However, there is a difference depending on whether the index is stored in
>> content or externally. For the former case using commit hooks is the right
>> thing to do. In the case of a failed commit nothing is written at all, not
>> even the index data. Using an observer here still works, but would leave
>> the index lagging behind from the time the commit actually succeeded until
>> the observer is finally called.
> 
> I think both options are valid for an in-content index, the basic
> trade-off here is between commit speed and conflict handling on the
> one hand and instant availability of index updates on the other.
> 
> A hook-based index is by definition always up to date with latest
> content, and is thus useful especially for things like UUID tables and
> other internal indices that need to be kept up to date at all times.
> 
> However, hooks add overhead to each individual commit and will either
> need automatic conflict resolution or synchronous execution to avoid
> index corruption in cases of concurrent commits. That makes them
> non-ideal for many of the more complex types of indices.

Maybe it would be nice to make the synchronous / asynchronous execution of a hook explicit
in the APIs, so that callers can run index updates accordingly (e.g. in threads for async
calls). However, I'm not sure this would fit Oak's concurrency model.
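
A minimal sketch of what an explicit synchronous / asynchronous mode could look like. All
names here (IndexDispatchSketch, Mode, Commit, IndexUpdater) are hypothetical and are not
Oak APIs; a commit is reduced to a list of changed paths purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch; none of these types are Oak APIs.
final class IndexDispatchSketch {

    /** How an index wants its updates to be executed. */
    enum Mode { SYNC, ASYNC }

    /** A commit, reduced to the paths it changed. */
    static final class Commit {
        final List<String> changedPaths;
        Commit(List<String> changedPaths) { this.changedPaths = changedPaths; }
    }

    /** Index update callback that declares its own execution mode. */
    interface IndexUpdater {
        Mode mode();
        void update(Commit commit);
    }

    /** Convenience factory so callers can pass a lambda. */
    static IndexUpdater updater(Mode mode, Consumer<Commit> action) {
        return new IndexUpdater() {
            public Mode mode() { return mode; }
            public void update(Commit commit) { action.accept(commit); }
        };
    }

    private final List<IndexUpdater> updaters = new ArrayList<>();
    // A single background thread keeps asynchronous updates in commit order.
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    void register(IndexUpdater updater) { updaters.add(updater); }

    /** SYNC updaters run before commit() returns; ASYNC ones are only queued. */
    void commit(Commit commit) {
        for (IndexUpdater u : updaters) {
            if (u.mode() == Mode.SYNC) {
                u.update(commit); // an exception propagates to the committer
            } else {
                background.submit(() -> u.update(commit));
            }
        }
    }

    /** Stops accepting work and waits for queued asynchronous updates. */
    boolean shutdown() {
        background.shutdown();
        try {
            return background.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape a UUID-style index would register as SYNC and a more expensive index as
ASYNC; the asynchronous one lags behind the commit, which is the trade-off discussed above.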

> 
> Luckily most of the potential complex indices don't need to be up to
> date at all times, and thus can well be updated via an observer even
> if the index content is stored in the repository. In such cases the
> observer treats the repository like any other external index storage
> (i.e. it's not updated through the Observer interface like how hooks
> work), and would just need to make sure to ignore the content updates
> it itself makes.
> 
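
The "ignore its own updates" part could be sketched roughly as follows. This assumes each
change carries an origin marker the observer can check; Repo, Change and IndexObserver are
hypothetical toy types, not Oak APIs, and the observer is called inline here only to keep
the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these types are not Oak APIs.
final class SelfIgnoringObserverSketch {

    /** A committed change: who made it, and which path got which value. */
    static final class Change {
        final String origin;
        final String path;
        final String value;
        Change(String origin, String path, String value) {
            this.origin = origin;
            this.path = path;
            this.value = value;
        }
    }

    /** Toy repository that notifies its observer after every commit. */
    static final class Repo {
        final Map<String, String> content = new HashMap<>();
        IndexObserver observer;

        void commit(Change change) {
            content.put(change.path, change.value);
            if (observer != null) {
                observer.contentChanged(this, change);
            }
        }
    }

    /** Keeps a value-to-path index inside the repository itself. */
    static final class IndexObserver {
        static final String ORIGIN = "index-observer";
        int handled = 0;

        void contentChanged(Repo repo, Change change) {
            if (ORIGIN.equals(change.origin)) {
                return; // our own index write: skip it, otherwise we would loop
            }
            handled++;
            // Write the index entry back through a normal commit, tagged as ours.
            repo.commit(new Change(ORIGIN, "/index/" + change.value, change.path));
        }
    }
}
```
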
>> For externally stored indexes I think we need to live with the lag in favour
>> of having a consistent index.
> 
> Right; without implementing full distributed transaction support (and
> the associated concurrency overhead) it's impossible to keep an
> external index in sync with the repository at all times.

I agree, that would probably be too much effort (and not worth it performance-wise).
My 2 cents,
Tommaso

> 
> BR,
> 
> Jukka Zitting

