lucene-dev mailing list archives

From "Boris Galitsky" <bg7...@rambler.ru>
Subject accelerate hits.id(i) function: eliminating scoring for the sake of efficiency
Date Thu, 11 May 2006 22:06:43 GMT
Yes, thanks Paul.

  We are already using
> getSpans() on the top level SpanQuery, and use a loop
> calling next() on the Spans, and ignore duplicate doc() values from the
> Spans in that loop.
> A counter in the loop would also give you the number of matching
> occurrences of the SpanQuery.

I will look into
> NearSpansOrdered here might be a bit faster than the NearSpans

However, what slows us down significantly is the hits.id(i) call.
Can we speed it up somehow by "cleaning" the scoring out of the Lucene
code itself?

Best regards
Boris



> On Thursday 11 May 2006 22:42, Boris Galitsky wrote:
>> Hello
>> 
>>     We don't need any scoring in our application domain, but
>> efficiency is key because we are getting tens of thousands of hits
>> for span queries, and all of these hits need to be collected.
>>     Is there a simple way to turn scoring off during indexing, during
>> search, and while delivering document IDs, to save time?
> 
> You could use getSpans() on the top level SpanQuery, and use a loop
> calling next() on the Spans, and ignore duplicate doc() values from the
> Spans in that loop.
> A counter in the loop would also give you the number of matching
> occurrences of the SpanQuery.
> 
> This way of using the Spans directly should be slightly more efficient
> than using a HitCollector, but don't hold your breath.
> 
> In case you have ordered SpanQuery's without overlaps, the
> NearSpansOrdered here might be a bit faster than the NearSpans
> currently in Lucene:
> http://issues.apache.org/jira/browse/LUCENE-413
> (you'll also need the patch to SpanNearQuery).
> 
> Regards,
> Paul Elschot
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
> 
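
The loop Paul describes above can be sketched roughly as follows. This is only an illustration, not Lucene code: `SpansLike`, `ArraySpans`, `SpanDocCollector` and `SpanLoopDemo` are hypothetical stand-ins so that the example runs without Lucene on the classpath. In a real application the object would presumably come from `getSpans()` on the top level SpanQuery, and only the `next()` and `doc()` methods named in the mail are used. The sketch also assumes that matches arrive grouped by document, so a duplicate can be detected by comparing against the previous `doc()` value.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the two Spans methods named in the mail; NOT the Lucene class. */
interface SpansLike {
    boolean next(); // advance to the next match; false when there are no more
    int doc();      // document number of the current match
}

/** Hypothetical in-memory spans used only to make the sketch runnable. */
final class ArraySpans implements SpansLike {
    private final int[] docs; // one entry per match, grouped by document
    private int pos = -1;

    ArraySpans(int... docs) { this.docs = docs; }

    public boolean next() { return ++pos < docs.length; }

    public int doc() { return docs[pos]; }
}

/** Collects distinct document numbers and counts all matching occurrences. */
final class SpanDocCollector {
    final List<Integer> docIds = new ArrayList<>();
    int occurrences = 0;

    static SpanDocCollector collect(SpansLike spans) {
        SpanDocCollector result = new SpanDocCollector();
        int lastDoc = -1; // assumes document numbers are never negative
        while (spans.next()) {
            result.occurrences++;          // counter for every match
            int doc = spans.doc();
            if (doc != lastDoc) {          // ignore duplicate doc() values
                result.docIds.add(doc);
                lastDoc = doc;
            }
        }
        return result;
    }
}

public class SpanLoopDemo {
    public static void main(String[] args) {
        // Six matches spread over three documents.
        SpanDocCollector r =
            SpanDocCollector.collect(new ArraySpans(2, 2, 5, 9, 9, 9));
        System.out.println(r.docIds + " " + r.occurrences); // prints "[2, 5, 9] 6"
    }
}
```

No score is computed anywhere in this loop, which is the point of the suggestion: the document numbers are read directly from the spans rather than through hits.id(i).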



