lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Pasero <benjamin.pas...@gmail.com>
Subject Re: What would be the fastest BooleanQuery possible?
Date Wed, 16 Sep 2009 14:50:11 GMT
Thanks, I tried this but profiling showed me that I get similar
results. Most time is spent in
- BooleanQuery.createWeight()
- BooleanScorer.next()

If I am not interested in scores, do I still need the heavy weight computation?

On Wed, Sep 16, 2009 at 4:16 PM, Michael McCandless
<lucene@mikemccandless.com> wrote:
> You could get the Scorer and call next() yourself; this would avoid
> scoring.  EG something like this:
>
>      Weight weight = query.weight(searcher);
>      Scorer scorer = weight.scorer(searcher.getIndexReader());
>      while(scorer.next()) {
>        final int docID = scorer.doc();
>        /* do something w/ docID */
>      }
>
> But note that this code is not generally recommended in 2.9 (since
> it's not operating at the segment level).
>
> If your queries contain only SHOULD and up to 32 MUST_NOT clauses,
> then calling BooleanQuery.setAllowDocsOutOfOrder should improve
> performance since internally it will use BooleanScorer instead of
> BooleanScorer2.
>
> Mike
>
> On Wed, Sep 16, 2009 at 9:14 AM, Benjamin Pasero
> <benjamin.pasero@gmail.com> wrote:
>> Ah wow that sounds great. I am using 2.3.2 though (and have to use it
>> for now). Anything
>> in that version that could speed things up?
>>
>> On Wed, Sep 16, 2009 at 6:48 PM, Mark Miller <markrmiller@gmail.com> wrote:
>>> With the new Collector API in Lucene 2.9, you no longer have to compute the
>>> score.
>>>
>>> Now a Collector is passed a Scorer if they want to use it, but you can
>>> just ignore it.
>>>
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>>>
>>>
>>> Benjamin Pasero wrote:
>>>> Hi,
>>>>
>>>> I am using Lucene not only for smart fulltext searches but also for
>>>> getting the results for a DB-like query, where I am not tokenizing the
>>>> terms at all. For this query, I am interested in all results and for
>>>> that
>>>> I am using my own HitCollector.
>>>>
>>>> Now, while profiling I noticed that quite some time is spent in
>>>> methods like TermQuery.weight() or BooleanScorer2.score(). Given that
>>>> I am interested in all results, I am not interested in any score for
>>>> the
>>>> results.
>>>>
>>>> Is it possible to run a query where Lucene simply checks if a Document
>>>> is a hit or not and completly ignore weighting and scoring? Or is that
>>>> an integrated part of the search used to determine if a Document
>>>> is a hit or not?
>>>>
>>>> Thanks for helping,
>>>> Ben
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message