jackrabbit-dev mailing list archives

From "Alex Parvulescu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-3513) Slower range query execution
Date Fri, 08 Feb 2013 08:55:15 GMT

     [ https://issues.apache.org/jira/browse/JCR-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3513:
---------------------------------

    Attachment: JCR-3513.patch

> We go with the second solution and removed the method org.apache.jackrabbit.core.query.lucene.RangeQuery.rewrite(IndexReader).

I'm going to propose a different solution. There is a way to force Lucene to rewrite without
using a filter: call #setRewriteMethod with 'CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE'.
This way, I think it would fall back to the previous behavior (no more filters).

This rewrite behavior affects all MultiTermQuery impls, and the RangeQuery seems to be the
last one that is still using Lucene's default.
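
To make the difference between the two rewrite strategies concrete, here is a small self-contained sketch. It is a simplified model and not Lucene or Jackrabbit code: the class RewriteSketch, the enums, and the TERM_CUTOFF value are all hypothetical names and numbers chosen for illustration. The AUTO mode stands in for a default that switches to a filter once a range expands to many terms, and BOOLEAN_ONLY stands in for what setting 'CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE' is meant to achieve, namely always expanding the range into OR-ed term clauses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Simplified model of range-query rewriting. NOT Lucene code; all names are hypothetical.
class RewriteSketch {

    // AUTO: pick a strategy from the number of matching terms.
    // BOOLEAN_ONLY: always expand the range into OR-ed term clauses.
    enum RewriteMode { AUTO, BOOLEAN_ONLY }

    enum Plan { BOOLEAN_QUERY, FILTER }

    // Hypothetical cutoff, chosen only to keep the example small.
    static final int TERM_CUTOFF = 3;

    // Enumerate indexed terms in [lower, upper): lower inclusive, upper exclusive.
    static List<String> termsInRange(SortedSet<String> index, String lower, String upper) {
        return new ArrayList<String>(index.subSet(lower, upper));
    }

    // Decide how the range would be executed under the given mode.
    static Plan choosePlan(RewriteMode mode, int matchingTerms) {
        if (mode == RewriteMode.BOOLEAN_ONLY) {
            return Plan.BOOLEAN_QUERY;
        }
        return matchingTerms > TERM_CUTOFF ? Plan.FILTER : Plan.BOOLEAN_QUERY;
    }

    // Render the rewritten query as a string so the outcome is easy to inspect.
    static String rewrite(RewriteMode mode, String field, List<String> terms) {
        if (choosePlan(mode, terms.size()) == Plan.FILTER) {
            return "filter(" + field + ", " + terms.size() + " terms)";
        }
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append(field).append(':').append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedSet<String> index = new TreeSet<String>(Arrays.asList(
                "2012-11", "2012-12", "2013-01", "2013-02", "2013-03"));
        List<String> terms = termsInRange(index, "2012-11", "2013-03");
        System.out.println(rewrite(RewriteMode.AUTO, "created", terms));
        System.out.println(rewrite(RewriteMode.BOOLEAN_ONLY, "created", terms));
    }
}
```

In this model the same range yields a filter under AUTO once it matches more than TERM_CUTOFF terms, while BOOLEAN_ONLY keeps producing term clauses regardless of size. Whether the boolean form is actually faster in a given repository is something only a measurement on real data can show.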

I'm attaching a patch shortly (it is against trunk, but it should apply without problems on
the 2.4 code).

Tom, I'd appreciate it if you could give it a go in your setup :)
                
> Slower range query execution
> ----------------------------
>
>                 Key: JCR-3513
>                 URL: https://issues.apache.org/jira/browse/JCR-3513
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>    Affects Versions: 2.4.3
>            Reporter: Tom Quellenberg
>            Assignee: Alex Parvulescu
>         Attachments: JCR-3513.patch
>
>
> After switching from Jackrabbit 1.6.4 to 2.4.3 we experienced extremely slow query execution. Range queries on date fields are often 10 times slower than before.
> In our repositories more than 1 million documents are stored, all of which contain, for example, a creation date. Typical queries look like this:
> //element(*, sophora-nt:story)[@sophora:creationDate > ...]
> Jackrabbit has its own RangeQuery implementation which is used when Lucene throws a TooManyBooleanClauses exception (and in some other situations, too). This worked well in Jackrabbit 1.6. In newer versions a different Lucene library is used which never throws TooManyBooleanClauses exceptions. Instead, it has its own fall-back for situations where a BooleanQuery does not work. This fall-back with a MultiTermQueryWrapperFilter seems to us much slower than the fall-back implementation in Jackrabbit (does anybody know the reason?). The situation is the same in Jackrabbit 2.6.0 (with Lucene 3.6.0).
> We patched org.apache.jackrabbit.core.query.lucene.RangeQuery to never use org.apache.lucene.search.TermRangeQuery but always use the Jackrabbit implementation. This leads to query execution as fast as in older Jackrabbit versions.
> Do other people experience this problem? Are there any drawbacks to always using the Jackrabbit implementation for range queries?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
