lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-1951) wildcardquery rewrite improvements
Date Wed, 07 Oct 2009 16:48:35 GMT


Robert Muir commented on LUCENE-1951:

Here are some stats for rewriting wildcards that should be prefix queries.
I query on a field with about 10M numeric terms (a unique database id), average length 10
characters or so.
I copied this into a RAMDirectory to try to rule out I/O a bit (it's only a 1GB index and I use
a 4GB heap).

I look for all the ones starting with "1" (about 1.5 million of these). I did 3 runs, 100
queries each.
Here are the average times for each.


So, it's not a big optimization, but it seems consistent, and it may matter more if the average
term length is longer: in that case wildcard's comparison function might have to do even more work.
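
To make the difference in per-term work concrete, here is a minimal sketch of the two kinds of comparison. It is not Lucene's code: the class and method names are hypothetical, and the wildcard matcher is a generic backtracking matcher that assumes `*` and `?` are the only wildcard characters, with no escaping. For a pattern like "1*" both accept the same terms, but the prefix check is a single startsWith.

```java
// Hypothetical illustration only; not taken from Lucene.
class TermCompareSketch {

    // Prefix comparison: one startsWith call per enumerated term.
    static boolean prefixMatch(String prefix, String term) {
        return term.startsWith(prefix);
    }

    // Generic wildcard comparison, assuming '*' (any run of characters)
    // and '?' (exactly one character) are the only wildcards.
    static boolean wildcardMatch(String pattern, String term) {
        int p = 0, t = 0;
        int starP = -1, starT = -1; // position of the last '*' seen, and the term position it was tried at
        while (t < term.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                starP = p++;        // remember the star, first try matching it against nothing
                starT = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == term.charAt(t))) {
                p++;
                t++;
            } else if (starP >= 0) {
                p = starP + 1;      // backtrack: let the last '*' absorb one more character
                t = ++starT;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;                    // trailing stars match the empty string
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        String[] terms = { "1234567890", "1", "2345678901", "" };
        for (String term : terms) {
            System.out.println("\"" + term + "\" prefix=" + prefixMatch("1", term)
                    + " wildcard=" + wildcardMatch("1*", term));
        }
    }
}
```

The wildcard matcher has to walk the pattern and track backtracking state for every term, which is the extra work that would grow with term length.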

I'll work on a patch to fix the boost/constant score and include a prefixquery rewrite for
this case.
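
As a rough sketch of the rewrite decision described in this issue (not the actual patch): classify the pattern as a plain term, a prefix, or a true wildcard, and carry the boost and the constant-score choice along in every case. All names here (WildcardRewriteSketch, Kind, Rewritten) are hypothetical, and the sketch again assumes `*` and `?` are the only wildcard characters, with no escaping.

```java
// Hypothetical illustration only; not the LUCENE-1951 patch and not Lucene's API.
class WildcardRewriteSketch {

    enum Kind { TERM, PREFIX, WILDCARD }

    // Plain holder for the outcome of the rewrite decision.
    static final class Rewritten {
        final Kind kind;
        final String text;           // term text, prefix text, or the original pattern
        final float boost;           // passed along unchanged
        final boolean constantScore; // the user's rewrite-method choice, preserved

        Rewritten(Kind kind, String text, float boost, boolean constantScore) {
            this.kind = kind;
            this.text = text;
            this.boost = boost;
            this.constantScore = constantScore;
        }
    }

    static Kind classify(String pattern) {
        int star = pattern.indexOf('*');
        int question = pattern.indexOf('?');
        if (star < 0 && question < 0) {
            return Kind.TERM;        // no wildcard character at all
        }
        if (question < 0 && star == pattern.length() - 1) {
            return Kind.PREFIX;      // the first (and therefore only) '*' is the last character
        }
        return Kind.WILDCARD;
    }

    static Rewritten rewrite(String pattern, float boost, boolean constantScore) {
        Kind kind = classify(pattern);
        String text = (kind == Kind.PREFIX)
                ? pattern.substring(0, pattern.length() - 1) // drop the trailing '*'
                : pattern;
        return new Rewritten(kind, text, boost, constantScore);
    }

    public static void main(String[] args) {
        for (String pattern : new String[] { "12345", "1*", "1?3*", "*1" }) {
            Rewritten r = rewrite(pattern, 2.0f, true);
            System.out.println(pattern + " -> " + r.kind + " \"" + r.text + "\""
                    + " boost=" + r.boost + " constantScore=" + r.constantScore);
        }
    }
}
```

Note that under this classification a bare "*" counts as a prefix with empty text, which matches every term; a real implementation might want to treat that case separately.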

> wildcardquery rewrite improvements
> ----------------------------------
>                 Key: LUCENE-1951
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Query/Scoring
>            Reporter: Robert Muir
>            Priority: Minor
> wildcardquery has logic to rewrite to termquery if there is no wildcard character, but
> * it needs to pass along the boost if it does this
> * if the user asked for a 'constant score' rewriteMethod, it should rewrite to a constant
>   score query for consistency.
> additionally, if the query is really a prefixquery, it would be nice to rewrite to prefix
> both will enumerate the same number of terms, but prefixquery has a simpler comparison
