lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <>
Subject Re: SpanRegex speed
Date Fri, 01 Sep 2006 11:59:00 GMT
Erick Erickson wrote:
> Let me chime in here on a different note.... before you get happy with
> wildcard queries, take a look at the thread "I just don't get 
> wildcards at
> all". There is lots of good info that Erik, Chris and Otis provided me.
> The danger with prefixquery and wildcard query is that they will throw
> TooManyClauses exceptions when you start matching a number of terms (the
> default is 1024, although you can make this much bigger if memory 
> allows).
> If you're aware of this and it is and will be OK in your app, ignore 
> this.
> But if your index is going to grow significantly, this is a real 
> problem. I
> went with implementing filters with WildCardTermEnum (you could also use
> RegexTermEnum) for the wildcard portions of my query. Which has 
> interesting
> implications for spans, we elected to say spans didn't work with 
> wildcards.
> Anyway, as I said, if you're aware of the TooManyClauses issue and are 
> sure
> it doesn't matter, ignore me. After all, everybody else does <G>.....
> Best
> Erick
> On 8/30/06, Mark Miller <> wrote:
>> Ignore that last question. I see that you said prefix wildcard query and
>> not wildcard query. A quick look at the code seems to show it grabbing a
>> prefix as well.
>> Do you think one would be any faster than the other? Should I used
>> Wildcardqueries outside of spanqueries and the regexquery inside
>> spanqueries or use regex both places?
>> - Mark
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
Thanks a lot for the info Eric. Good stuff to know for sure.
I guess the real question I have been trying to spit out is this:
Is a span version of any of these searches--fuzzy, wildcard, 
etc--inherently slower than their non-span brothers. If they have the 
same limitations and speeds then that is all I am looking for.

I realize I have been screwing up the threading by replying when 
starting a new topic. I have been alerted and will stop this pernicious 

- Mark

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message