jackrabbit-users mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: Custom index type
Date Tue, 18 Oct 2016 08:08:56 GMT

> "What's the best way to query for LARGE NUMBERS of key/value pairs?"

As I wrote, I'm not aware of a limit (on the number of values or the query
length) in Oak. But sure, if it is possible to avoid having a large query,
then it should be avoided, for simplicity.
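
For illustration only: if a query with many key/value conditions is needed, it can be generated rather than written by hand. The following is a minimal sketch, assuming the '=' comparison suggested in the quoted message below and the node type, path and property names from this thread. TagQueryBuilder and its methods are hypothetical helpers, not part of Oak or the JCR API, and the sketch only builds the JCR-SQL2 string; it does not execute it.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Oak or JCR): builds a JCR-SQL2 query string. */
public final class TagQueryBuilder {

    private TagQueryBuilder() {}

    /** Wraps a value in single quotes, doubling embedded quotes as JCR-SQL2 expects. */
    public static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds a query for pages under rootPath that carry any of the given tags. */
    public static String build(String rootPath, List<String> tags) {
        if (tags.isEmpty()) {
            throw new IllegalArgumentException("at least one tag is required");
        }
        // One equality condition per tag, joined with OR.
        String conditions = tags.stream()
                .map(tag -> "[cq:tags] = " + quote(tag))
                .collect(Collectors.joining(" OR "));
        return "SELECT * FROM [cq:PageContent] AS b"
                + " WHERE ISDESCENDANTNODE(b, [" + rootPath + "])"
                + " AND (" + conditions + ")"
                + " ORDER BY [cq:lastModified]";
    }

    public static void main(String[] args) {
        System.out.println(build("/content/guides",
                List.of("location:europe", "type:waterfalls")));
    }
}
```

Note that '=' matches only the exact tag value, so unlike the LIKE 'location:europe%' prefix it would not cover tags further down the tag tree. Whether a query of this shape is served by a property index depends on the index definitions in the repository.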


On 14/10/16 18:28, "Clay Ferguson" <wclayf@gmail.com> wrote:

>The "Traversed 210000 nodes" warning is really telling you that it was
>unable to use your indexes to perform the search (I think). This doesn't
>mean too many results were found; it just means you didn't create all the
>right indexes for the search. Just create an index for each property, and
>then search them the normal way (without the LIKE clause, but using '=')
>and I bet you will see good performance. If you genuinely have thousands
>of key/value pairs to search, it is possible that your full-text approach
>is the best-performing solution, but I'm not sure.
>However your general question is: "What's the best way to query for LARGE
>NUMBERS of key/value pairs?"
>Maybe some experts who know more than me about Oak can reply to that
>simplified version of your question.
>Best regards,
>Clay Ferguson
>On Fri, Oct 14, 2016 at 10:29 AM, rachna <rachana.mehta@telegraph.co.uk>
>wrote:
>> Thanks Clay & Thomas.
>> Taking a step back from our problem has helped us to look at it in a
>> different way.
>> The tag property also stores the values in a specific format that shows
>> the tree structure.
>> cq:tags
>> - location:europe
>> - type:waterfalls
>> Therefore instead of traversing the repository to identify the
>> of these tags, we could use a LIKE query.
>> SELECT * FROM [cq:PageContent] AS b WHERE ISDESCENDANTNODE(b,
>> [/content/guides]) AND ([cq:tags] LIKE 'location:europe%' OR [cq:tags]
>> LIKE 'type:waterfalls%') ORDER BY [cq:lastModified]
>> However, since our repository contains a large number of items that
>> match these criteria, we start to see warnings about traversing the index.
>> org.apache.jackrabbit.oak.plugins.index.property.strategy.
>> ContentMirrorStoreStrategy
>> Traversed 210000 nodes (210164 index entries) using index
>> with filter Filter(query=SELECT * FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND ([cq:tags] LIKE
>> 'location:europe%' OR [cq:tags] LIKE 'type:waterfalls%') ORDER BY
>> [cq:lastModified], path=/content/guides//*, property=[cq:tags=[is not
>> null]])
>> Instead, I created a Lucene index that indexes the cq:tags (with full-text
>> support) and cq:lastModified (with ordered support) properties.
>> e.g. SELECT [jcr:path] FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND
>> (CONTAINS([cq:tags], 'location:europe') OR
>> CONTAINS([cq:tags], 'type:waterfalls')) ORDER BY [cq:lastModified]
>> That seems to be much faster than using a property index and should
>> resolve most of the issues that we might have (hopefully avoiding creating
>> a new index).
>> Is there any support with the Lucene index to use something like LIKE
>> rather than CONTAINS?
>> The maxClauseCount configuration parameter introduced the soft limit of
>> 1024
>> which is part of Jackrabbit 2.
>> We have been attempting to move to Oak; however, our progress has been
>> slow due to repository inconsistencies.
>> I realise this value is configurable; however, constantly increasing it
>> doesn't sound like the right thing to do.
>> Thanks,
>> Rachna
>> --
>> View this message in context: http://jackrabbit.510166.n4.
>> nabble.com/Custom-index-type-tp4665031p4665121.html
>> Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
