lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2760) optimize spanfirstquery, spanpositionrangequery
Date Sun, 14 Nov 2010 21:45:13 GMT


Robert Muir commented on LUCENE-2760:

Admittedly, I don't have a good benchmarking setup for these span queries yet.

But in a quick test on a 125k-document corpus, a SpanFirstQuery on a common term like
"the" took about half the time. This is because it read and evaluated 117,556 positions
instead of 1,029,622.

> optimize spanfirstquery, spanpositionrangequery
> -----------------------------------------------
>                 Key: LUCENE-2760
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2760.patch
> SpanFirstQuery and SpanPositionRangeQuery (SpanFirst is just a special case of this)
> are currently inefficient.
> Take this worst case example: SpanFirstQuery("the").
> Currently the code reads all the positions for the term "the".
> But when enumerating spans, once we have passed the allowable range we should move on
> to the next document (skipTo).
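
The idea in the description can be sketched roughly as below. This is not the attached
patch and does not use Lucene's Spans API: the class SpanFirstSketch, its two methods, and
the sample postings are hypothetical, and the sketch assumes each document's positions
arrive in ascending order and that a position matches when it is less than `end`. It only
shows why stopping at the first out-of-range position reads fewer positions while finding
the same matches.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class SpanFirstSketch {

    /** Reads every position of every document. Returns {matches, positionsRead}. */
    static long[] readAllPositions(List<int[]> docs, int end) {
        long matches = 0;
        long read = 0;
        for (int[] positions : docs) {
            for (int pos : positions) {
                read++;
                if (pos < end) { // assumed match rule: position lies inside the range
                    matches++;
                }
            }
        }
        return new long[] {matches, read};
    }

    /** Leaves a document at the first position past the range. Returns {matches, positionsRead}. */
    static long[] readWithEarlyExit(List<int[]> docs, int end) {
        long matches = 0;
        long read = 0;
        for (int[] positions : docs) {
            for (int pos : positions) {
                read++;
                if (pos >= end) {
                    // Positions are ascending, so nothing later in this document
                    // can match; move on to the next document.
                    break;
                }
                matches++;
            }
        }
        return new long[] {matches, read};
    }

    public static void main(String[] args) {
        // Made-up postings for one term in three documents.
        List<int[]> docs = Arrays.asList(
                new int[] {0, 4, 9, 15, 22},
                new int[] {7, 8},
                new int[] {1, 2, 3, 40, 41, 42});
        long[] all = readAllPositions(docs, 3);
        long[] early = readWithEarlyExit(docs, 3);
        System.out.println("read all:   matches=" + all[0] + " positionsRead=" + all[1]);
        System.out.println("early exit: matches=" + early[0] + " positionsRead=" + early[1]);
    }
}
```

On this toy input both methods report 3 matches, but the early-exit variant reads 6
positions where the exhaustive one reads 13. The saving grows with the number of
occurrences per document, which is why a very common term is the worst case.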

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
