lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: SpanRegex speed
Date Fri, 01 Sep 2006 11:59:00 GMT
Erick Erickson wrote:
> Let me chime in here on a different note.... before you get happy with
> wildcard queries, take a look at the thread "I just don't get wildcards at
> all". There is lots of good info that Erik, Chris and Otis provided me.
>
> The danger with prefixquery and wildcard query is that they will throw
> TooManyClauses exceptions when you start matching a number of terms (the
> default is 1024, although you can make this much bigger if memory allows).
> If you're aware of this and it is and will be OK in your app, ignore this.
> But if your index is going to grow significantly, this is a real problem. I
> went with implementing filters with WildCardTermEnum (you could also use
> RegexTermEnum) for the wildcard portions of my query. Which has interesting
> implications for spans, we elected to say spans didn't work with wildcards.
>
> Anyway, as I said, if you're aware of the TooManyClauses issue and are sure
> it doesn't matter, ignore me. After all, everybody else does <G>.....
>
>
> Best
> Erick
>
>
>
> On 8/30/06, Mark Miller <markrmiller@gmail.com> wrote:
>>
>> Ignore that last question. I see that you said prefix wildcard query and
>> not wildcard query. A quick look at the code seems to show it grabbing a
>> prefix as well.
>>
>> Do you think one would be any faster than the other? Should I use
>> Wildcardqueries outside of spanqueries and the regexquery inside
>> spanqueries or use regex both places?
>>
>> - Mark
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
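To make the TooManyClauses point in the quoted message concrete, here is a minimal plain-Java sketch. It does not use Lucene: the names ClauseLimitSketch, expandPrefix, filterPrefix and TooManyClausesException are made up for illustration, and the toy term dictionary only assumes that a prefix or wildcard query expands to one clause per matching term, while a filter walks the same terms and marks documents in a bit set. In Lucene itself the limit is the one exposed through BooleanQuery.getMaxClauseCount() and BooleanQuery.setMaxClauseCount(int).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: none of these names come from Lucene.
final class ClauseLimitSketch {

    // Hypothetical stand-in for the TooManyClauses exception discussed above.
    static final class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int limit) {
            super("prefix expanded to more than " + limit + " clauses");
        }
    }

    // Query-style expansion: collect one clause per matching term and
    // fail as soon as the list would grow past maxClauses.
    static List<String> expandPrefix(SortedMap<String, int[]> postings,
                                     String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous,
        // starting at the first key >= prefix.
        for (String term : postings.tailMap(prefix).keySet()) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Filter-style enumeration: walk the same terms, but only set a bit per
    // matching document, so there is no clause list that can overflow.
    static BitSet filterPrefix(SortedMap<String, int[]> postings, String prefix) {
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> entry : postings.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            for (int docId : entry.getValue()) {
                docs.set(docId);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        // 2000 distinct terms, each appearing in exactly one document.
        SortedMap<String, int[]> postings = new TreeMap<>();
        for (int i = 0; i < 2000; i++) {
            postings.put(String.format("term%04d", i), new int[] {i});
        }
        try {
            expandPrefix(postings, "term", 1024);
        } catch (TooManyClausesException e) {
            System.out.println("query-style expansion failed: " + e.getMessage());
        }
        BitSet hits = filterPrefix(postings, "term");
        System.out.println("filter-style matched documents: " + hits.cardinality());
    }
}
```

The trade-off in this sketch is the one the quoted message describes: the filter side scales with the number of matching terms without hitting a clause limit, but it only yields a set of documents, not per-term clauses that a span query could work with.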
Thanks a lot for the info, Erick. Good stuff to know for sure.
I guess the real question I have been trying to spit out is this:
Is a span version of any of these searches--fuzzy, wildcard, etc--inherently
slower than its non-span brother? If they have the same limitations and
speeds, then that is all I am looking for.

P.S.
I realize I have been screwing up the threading by replying when starting a
new topic. I have been alerted and will stop this pernicious activity.

- Mark


