lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: SpanRegex speed
Date Fri, 01 Sep 2006 14:12:36 GMT
OK, a not very helpful answer: of course the span versions are slower, since
they do more work (they have to track match positions, not just matching
documents). But that's fairly useless on its own, since the real question is
"is it enough slower in my situation that I need to find an alternative?".
And the only way I know of to answer that question is to run some tests
against data that represents your particular problem.

Sorry I can't be more help....
Erick
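
One way to run the kind of test described above is a small timing harness that measures each query variant the same way. The following is only a sketch under assumptions: the class name QueryTiming, the method medianNanos, and the busyWork placeholder are all hypothetical and not part of Lucene. In a real measurement each Runnable would execute one search against your own index.

```java
import java.util.Arrays;

final class QueryTiming {

    /**
     * Runs the task `warmup` times untimed (to let the JIT settle), then
     * `runs` times timed, and returns the middle sample in nanoseconds
     * (the upper median when `runs` is even).
     */
    static long medianNanos(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins. In a real test each Runnable would run one
        // search on your index: the plain query in one, its span form in the other.
        Runnable plainSearch = () -> busyWork(20_000);
        Runnable spanSearch = () -> busyWork(60_000);

        long plain = medianNanos(plainSearch, 50, 201);
        long span = medianNanos(spanSearch, 50, 201);
        System.out.printf("plain: %d ns, span: %d ns, ratio: %.2f%n",
                plain, span, (double) span / Math.max(1, plain));
    }

    /** Placeholder workload so the sketch runs without an index. */
    private static long busyWork(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += Integer.toString(i).hashCode();
        }
        return acc;
    }
}
```

The median is used rather than the mean so that a single garbage-collection pause or disk hiccup does not dominate the comparison. The numbers only mean something when the index and the queries resemble production data.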

On 9/1/06, Mark Miller <markrmiller@gmail.com> wrote:
>
> Erick Erickson wrote:
> > Let me chime in here on a different note.... before you get happy with
> > wildcard queries, take a look at the thread "I just don't get wildcards
> > at all". There is lots of good info that Erik, Chris and Otis provided me.
> >
> > The danger with PrefixQuery and WildcardQuery is that they will throw
> > TooManyClauses exceptions when they match more terms than the clause
> > limit (the default is 1024, although you can make this much bigger if
> > memory allows). If you're aware of this and it is and will be OK in your
> > app, ignore this. But if your index is going to grow significantly, this
> > is a real problem. I went with implementing filters with WildcardTermEnum
> > (you could also use RegexTermEnum) for the wildcard portions of my query.
> > That has interesting implications for spans; we elected to say spans
> > didn't work with wildcards.
> >
> > Anyway, as I said, if you're aware of the TooManyClauses issue and are
> > sure it doesn't matter, ignore me. After all, everybody else does <G>.....
> >
> >
> > Best
> > Erick
> >
> >
> >
> > On 8/30/06, Mark Miller <markrmiller@gmail.com> wrote:
> >>
> >> Ignore that last question. I see that you said prefix wildcard query and
> >> not wildcard query. A quick look at the code seems to show it grabbing a
> >> prefix as well.
> >>
> >> Do you think one would be any faster than the other? Should I use
> >> wildcard queries outside of span queries and the regex query inside
> >> span queries, or use regex both places?
> >>
> >> - Mark
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> Thanks a lot for the info, Erick. Good stuff to know for sure.
> I guess the real question I have been trying to spit out is this:
> Is the span version of any of these searches--fuzzy, wildcard,
> etc--inherently slower than its non-span brother? If they have the
> same limitations and speeds then that is all I am looking for.
>
> P.S.
> I realize I have been screwing up the threading by replying when
> starting a new topic. I have been alerted and will stop this pernicious
> activity.
>
> - Mark
>
>
>
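
The TooManyClauses issue in the quoted message can be illustrated with a toy model. This is not Lucene code: PrefixExpansionSketch and its methods are hypothetical, the "index" is just a sorted map from term to document ids, and an IllegalStateException stands in for Lucene's BooleanQuery.TooManyClauses. The sketch contrasts rewriting a prefix into one clause per matching term (capped at the 1024 default mentioned above) with the filter approach, which walks the same terms but collects documents into a bit set and so has no clause cap.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class PrefixExpansionSketch {

    /** Default clause limit quoted in the thread. */
    static final int MAX_CLAUSES = 1024;

    /** term -> ids of documents containing it; a stand-in for a term dictionary. */
    private final TreeMap<String, int[]> postings = new TreeMap<>();

    void add(String term, int... docIds) {
        postings.put(term, docIds);
    }

    /** All terms starting with the prefix, found by a range scan over sorted terms. */
    private SortedMap<String, int[]> termsWithPrefix(String prefix) {
        return postings.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    /** Query-style rewrite: one clause per matching term, capped at MAX_CLAUSES. */
    List<String> expandToClauses(String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : termsWithPrefix(prefix).keySet()) {
            if (clauses.size() == MAX_CLAUSES) {
                throw new IllegalStateException(
                        "too many clauses: more than " + MAX_CLAUSES
                                + " terms match " + prefix + "*");
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Filter-style: OR the documents of every matching term into one bit set. */
    BitSet filterBits(String prefix) {
        BitSet bits = new BitSet();
        for (int[] docs : termsWithPrefix(prefix).values()) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        PrefixExpansionSketch index = new PrefixExpansionSketch();
        for (int i = 0; i < 2000; i++) {
            index.add(String.format("term%04d", i), i);
        }
        try {
            index.expandToClauses("term");
        } catch (IllegalStateException e) {
            System.out.println("query rewrite failed: " + e.getMessage());
        }
        System.out.println("filter matched docs: "
                + index.filterBits("term").cardinality());
    }
}
```

The trade-off the quoted message points at is visible here: the bit set records only which documents matched, not where, which is why the filter approach does not combine naturally with span queries.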
