jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Potential performance improvement?
Date Wed, 17 Feb 2010 11:54:48 GMT
On Tue, Feb 16, 2010 at 10:33 AM, Alexander Klimetschek
<aklimets@day.com> wrote:
> On Mon, Feb 15, 2010 at 13:28, Marcel Reutegger
> <marcel.reutegger@gmx.net> wrote:
>> On Fri, Feb 12, 2010 at 14:47, Alexander Klimetschek <aklimets@day.com> wrote:
>>> On Fri, Feb 12, 2010 at 13:33, Marcel Reutegger
>>> <marcel.reutegger@gmx.net> wrote:
>>>> jackrabbit does it in a similar way for quite some time now.
>>>
>>> To me it sounds like this partial-temporary-indexing feature should be
>>> part of Lucene directly (configurable, of course).
>>
>> well, it's not that easy. jackrabbit makes use of many assumptions and
>> implementation specific properties of the content that is indexed.
>> e.g. nodes are uniquely identifiable and it is not required to
>> immediately persist the index on commit. it is sufficient that a redo
>> log contains enough information to replay the changes. all this cannot
>> be moved easily into a more generic library like lucene. however there
>> is interesting work going on with the near-real-time index that we
>> might want to use in the future.
>
> I see. The near-real-time index sounds great (however, "real-time"
> always has to be taken carefully ;-)).

I skimmed http://code.google.com/p/zoie/. The documentation is not
entirely clear, but I assume that, as Marcel points out, they use
something similar to Jackrabbit's indexing strategy: a read-only
multi-index reader plus one in-memory index. AFAIK it is also similar
to Lucene's Ocean Realtime Search [1].
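
The strategy described above (several read-only indexes searched
together, plus one in-memory index for recent changes) can be sketched
roughly as follows. This is a minimal illustration in plain Java, not
Jackrabbit's or Zoie's actual code: the class name
VolatilePlusReadOnlyIndexSketch, its methods, and the map-based "index"
are all assumptions made for the example, and the redo log and
deletions are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: read-only segments plus one in-memory index.
class VolatilePlusReadOnlyIndexSketch {

    // Flushed segments (term -> node ids); never modified after flush().
    private final List<Map<String, Set<String>>> readOnly = new ArrayList<>();

    // The single writable index that receives all new changes.
    private Map<String, Set<String>> inMemory = new HashMap<>();

    // Index a node under the given terms; only the in-memory index changes.
    void add(String nodeId, String... terms) {
        for (String term : terms) {
            inMemory.computeIfAbsent(term, k -> new TreeSet<>()).add(nodeId);
        }
    }

    // Turn the in-memory index into a new read-only segment and start afresh.
    void flush() {
        if (inMemory.isEmpty()) {
            return;
        }
        readOnly.add(Collections.unmodifiableMap(inMemory));
        inMemory = new HashMap<>();
    }

    // Search all read-only segments and the in-memory index, merging the hits.
    Set<String> search(String term) {
        Set<String> hits = new TreeSet<>();
        for (Map<String, Set<String>> segment : readOnly) {
            hits.addAll(segment.getOrDefault(term, Collections.emptySet()));
        }
        hits.addAll(inMemory.getOrDefault(term, Collections.emptySet()));
        return hits;
    }

    int segmentCount() {
        return readOnly.size();
    }

    public static void main(String[] args) {
        VolatilePlusReadOnlyIndexSketch index = new VolatilePlusReadOnlyIndexSketch();
        index.add("n1", "lucene", "index");
        index.flush();
        index.add("n2", "lucene");
        System.out.println(index.search("lucene"));
    }
}
```

The point of the design is that a change only ever touches the small
in-memory index, so the larger segments can stay open and cached.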

As the current implementation in Jackrabbit already has read-only
indexes, I doubt that the gain from Lucene 2.9 will be that large. By
the way, a good paper on what is new in 2.9 can be found at [2]. What
I do think we can benefit from greatly is trie ranges, as range
queries on, for example, dates are currently really expensive.
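
To illustrate why trie ranges help, here is a minimal sketch of the
idea, assuming non-negative long values (for example, dates stored as
milliseconds). Each value is indexed at several precisions, and a range
is split into a few coarse prefix terms for the middle plus
full-precision terms at the edges. The class TrieRangeSketch and its
methods are hypothetical and written for this example; the splitting
loop is modelled on the general approach of Lucene's trie range code,
but it is not Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of trie range splitting; not Lucene's API.
class TrieRangeSketch {

    // One sub-range: matches every value v with min <= v <= max by
    // enumerating the prefix terms (v >>> shift).
    static final class SubRange {
        final long min;
        final long max;
        final int shift;

        SubRange(long min, long max, int shift) {
            this.min = min;
            // A term at this shift covers all values sharing its upper bits.
            this.max = max | ((1L << shift) - 1L);
            this.shift = shift;
        }

        // Number of index terms that must be visited for this sub-range.
        long termCount() {
            return (max >>> shift) - (min >>> shift) + 1;
        }
    }

    // Splits [lo, hi] into sub-ranges. Assumes 0 <= lo <= hi.
    static List<SubRange> split(long lo, long hi, int precisionStep) {
        if (precisionStep < 1 || precisionStep > 32) {
            throw new IllegalArgumentException("precisionStep must be 1..32");
        }
        if (lo < 0 || hi < lo) {
            throw new IllegalArgumentException("expects 0 <= lo <= hi");
        }
        List<SubRange> out = new ArrayList<>();
        long min = lo;
        long max = hi;
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            // Does an edge stick out of a whole block at the next precision?
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            boolean wrapped = nextMin < min || nextMax > max;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is usable: emit the rest at this precision.
                out.add(new SubRange(min, max, shift));
                break;
            }
            if (hasLower) {
                out.add(new SubRange(min, min | mask, shift));
            }
            if (hasUpper) {
                out.add(new SubRange(max & ~mask, max, shift));
            }
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    public static void main(String[] args) {
        for (SubRange r : split(3, 100, 4)) {
            System.out.println(r.min + ".." + r.max + " shift=" + r.shift
                    + " terms=" + r.termCount());
        }
    }
}
```

With a precision step of 4, split(3, 100, 4) yields three sub-ranges
that together visit 23 terms instead of 98 distinct values, and the
saving grows with the width of the range.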

Regards Ard

[1] http://wiki.apache.org/lucene-java/OceanRealtimeSearch
[2] http://www.lucidimagination.com/solutions/whitepapers

>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
>
