jackrabbit-dev mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: [jr3 optional features]
Date Thu, 23 Feb 2012 11:27:11 GMT
Hello,

On Thu, Feb 23, 2012 at 11:59 AM, Thomas Mueller <mueller@adobe.com> wrote:
> Hi,
>
>>a mismatch between hierarchical structured
>>data and flat indexing through Lucene (assuming Lucene as index).
>
> My idea is to use Lucene for fulltext indexing (of documents), but
> nothing else.
>
> The rest (property index, node type index, path index) is indexed in a
> different way, more like in databases: indexes are user defined and only
> apply to a certain subset of the data (filter by path prefix, node type,
> property name, value type,...). The index data is stored in the repository
> itself (or possibly in a separate 'index' repository, accessed using the
> MicroKernel API).
>
> Based on the query, the query engine would then choose the best index(es)
> to use.
>
>>If there would be a jr3 core and an
>>optional index, that would make sense to me.
>
> Yes, that's the plan.

Thank you for updating me Thomas, much appreciated.

Although I do follow your suggestions above, they raise so many
questions and problems in my head that I just can't imagine them ever
being solved (or I am just way too narrow-minded :-) ).

What about combining free text searches with other constraints? I
don't see how you can ever efficiently overlay (let alone score) the
results from a Lucene fulltext index with, for example, the results
from 'database indexes'. Also, it seems impossible to me to
combine hierarchical constraints with fulltext searching if the
hierarchy is not part of the Lucene index. For example:

/jcr:root/content[@jcr:contains(....)]/documents[@jcr:contains(...)]/news/2001
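
As a rough illustration of the concern, here is a toy sketch in plain
Python. It is not Lucene or any Jackrabbit API: the dict-based fulltext
index, the sample nodes and all function names are made up for the
example. It shows that when the hierarchy lives outside the fulltext
index, both hit sets have to be materialized first and then joined on
the parent path.

```python
# Toy model only: a plain dict stands in for a fulltext index.
# None of these names come from Jackrabbit or Lucene.

def parent_of(path):
    """Return the parent path of a node path ("/" for top-level nodes)."""
    return path.rsplit("/", 1)[0] or "/"

def build_fulltext(nodes):
    """Map each lower-cased term to the set of node paths containing it."""
    index = {}
    for path, text in nodes.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(path)
    return index

def child_hits_under_parent_hits(fulltext, parent_term, child_term):
    """Nodes matching child_term whose parent matches parent_term."""
    # Both hit sets are computed in full before they can be combined.
    parents = fulltext.get(parent_term.lower(), set())
    children = fulltext.get(child_term.lower(), set())
    # The hierarchical join happens outside the fulltext index.
    return sorted(c for c in children if parent_of(c) in parents)

# Hypothetical sample content.
nodes = {
    "/content": "jackrabbit overview",
    "/content/documents": "release notes",
    "/other": "lucene mirror",
    "/other/documents": "release archive",
}
fulltext = build_fulltext(nodes)
result = child_hits_under_parent_hits(fulltext, "jackrabbit", "release")
```

The join cost grows with the size of the two hit sets, not with the
size of the final result, and the sketch says nothing about how the
two scores would be combined.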

Either way, what you need just feels very complex to me, unless you
are ok with fulltext searches and 'database-like queries' not being
combinable. And still, I keep having doubts about a generic jcr fulltext
index: I think jcr nodes are too low level for fulltext indexing. You
already say fulltext indexing (of documents), and that is what
developers want. But a document is domain specific. In our case, it
is a tree of jcr nodes, which also contain links to other nodes (from
which you might want to include some text property in your document
index).
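
To make the 'document as a tree of nodes' point concrete, here is a
minimal sketch in plain Python. It does not use the JCR API: the node
layout (dicts with "text", "children" and "links"), the "title"
property taken from linked nodes and the sample data are all
assumptions for illustration.

```python
# Toy model only: a dict-based node tree, not the JCR API. The node layout
# and the "title" property pulled from linked nodes are assumptions.

def document_text(node, repository):
    """Collect the text of a node, its descendants and linked nodes' titles."""
    parts = []
    stack = [node]
    while stack:
        current = stack.pop()
        if "text" in current:
            parts.append(current["text"])
        # Follow links one hop only and take just one chosen property.
        for target_id in current.get("links", []):
            target = repository.get(target_id)
            if target and "title" in target:
                parts.append(target["title"])
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(current.get("children", [])))
    return " ".join(parts)

# Hypothetical sample data: one linked node and one small document tree.
repository = {"author-1": {"title": "Some Author"}}
news_item = {
    "text": "Jackrabbit 3 planning",
    "children": [
        {"text": "Indexing is optional", "links": ["author-1"]},
        {"text": "Core stays small"},
    ],
}
indexed_text = document_text(news_item, repository)
```

Which nodes belong to one document, and which properties of linked
nodes are pulled in, is exactly the domain-specific part that a
generic per-node index cannot know.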

Regards Ard

>
> Regards,
> Thomas
>
