jackrabbit-oak-dev mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: full text search improvements
Date Mon, 26 Mar 2012 15:05:53 GMT
On Mon, Mar 26, 2012 at 4:55 PM, Thomas Mueller <mueller@adobe.com> wrote:
> Hi,
>>I haven't looked at / tested JCR joins: I just can't imagine that it
>>scales enough, but perhaps this is more related to my 'Lucene 1.4
>>experience' :)
> Lucene 1.4?

That's when I first used Lucene, don't worry :) Note, however, that *many*
of the current Jackrabbit 2 search implementation designs still stem
from the shortcomings of the early Lucene 1.4 version! For example,
all properties are indexed in a single Lucene field, and
there is a hierarchy of Lucene indexes (there was no 'reopen' of an
index reader back then).

> For Oak, joins should perform well (I guess with 'scale' you mean

I meant the joins in Jackrabbit 2: they are implemented in Lucene
afaik, and I cannot imagine them performing very well for millions of
nodes. However, I did not test them, so I might be wrong.

For the current Oak implementation, I cannot judge the performance of
joins at all. By 'scale' I do indeed mean performance, but
specifically whether that performance holds up as the number of nodes grows.

> 'perform'). Currently only nested loop joins are implemented (this is what
> relational databases use most of the time). If this turns out to be a
> problem, we might want to implement other join algorithms (block-nested
> loop join, hash join, merge join). But first let's see if it really is a
> problem.
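
To make the trade-off between the join algorithms mentioned above concrete, here is a minimal sketch in Java. It is not Oak's or Jackrabbit's implementation; the `JoinSketch` class, the `Row` type and both method names are hypothetical and exist only for illustration. It assumes both inputs fit in memory and that the join condition is simple equality on one property value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {

    /** Hypothetical row: a node path plus the value of the join property. */
    static final class Row {
        final String path;
        final String key;

        Row(String path, String key) {
            this.path = path;
            this.key = key;
        }
    }

    /** Nested loop join: compares every left row with every right row, O(n * m). */
    static List<String> nestedLoopJoin(List<Row> left, List<Row> right) {
        List<String> result = new ArrayList<>();
        for (Row l : left) {
            for (Row r : right) {
                if (l.key.equals(r.key)) {
                    result.add(l.path + "=" + r.path);
                }
            }
        }
        return result;
    }

    /** Hash join: indexes the right side by key once, then probes it per left row. */
    static List<String> hashJoin(List<Row> left, List<Row> right) {
        Map<String, List<Row>> byKey = new HashMap<>();
        for (Row r : right) {
            byKey.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }
        List<String> result = new ArrayList<>();
        for (Row l : left) {
            List<Row> matches = byKey.getOrDefault(l.key, Collections.<Row>emptyList());
            for (Row r : matches) {
                result.add(l.path + "=" + r.path);
            }
        }
        return result;
    }
}
```

Both methods return the same pairs in the same order. The nested loop needs no extra memory but does n * m comparisons, while the hash join trades memory for the map against roughly one pass over each side. In a real query engine the inner side of a nested loop join is often an index lookup rather than a full scan, which this sketch does not model.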
>>I am not sure if it would be an issue for Oak, but for JR 1 and 2, we
>>build up a JCR session keeping virtual node states in memory: this can
>>grow too large, and it is not easy to limit.
> OK I see. With "virtual nodes" I was thinking about temporary nodes that
> only exist while iterating over the query result. But this is something I
> will keep in mind. I'm sure we will find a good solution.
>>but I think it is all much easier if we
>>expose faceting not over a node structure. Perhaps a row structure,
>>where some rows do not have a backing JCR node?
> It's hard to say right now, I think we should postpone talking about the
> implementation details until we have all the pieces and a good test case.

Yes, agreed

Regards Ard

> Regards,
> Thomas

Amsterdam - Oosteinde 11, 1017 WT Amsterdam
Boston - 1 Broadway, Cambridge, MA 02142

US +1 877 414 4776 (toll free)
Europe +31(0)20 522 4466
