lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: I just don't get wildcards at all.
Date Thu, 13 Apr 2006 15:22:07 GMT
More of the same....

So, now all I want to do is a SpanNearQuery on a wildcard. No problem if I
can use a wildcard query, but I can't because of the "TooManyClauses" issue.

We have two types of wildcards. The first is truncation, which I can handle by
indexing successively shorter terms with a special character at the end (e.g.
cat$ ca$ c$). This is neat; it gives me relevance, speed, all the virtues.
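
A minimal sketch of that truncation scheme, in plain Java with no Lucene
dependency. The class name TruncationTerms, its methods, and the choice of
which prefixes to emit are assumptions made for illustration; in a real index
this logic would presumably live in a custom token filter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Lucene class): builds the extra index terms for truncation. */
class TruncationTerms {
    // End-of-prefix marker from the message; assumed never to occur in real tokens.
    static final char MARKER = '$';

    /** For "cat" returns [cat$, ca$, c$]: every non-empty prefix, longest first, plus the marker. */
    static List<String> expand(String token) {
        List<String> terms = new ArrayList<>();
        for (int len = token.length(); len >= 1; len--) {
            terms.add(token.substring(0, len) + MARKER);
        }
        return terms;
    }

    /** Rewrites a trailing-wildcard query such as "ca*" into the single term "ca$". */
    static String rewriteQuery(String query) {
        if (query.length() < 2 || !query.endsWith("*")) {
            throw new IllegalArgumentException("expected a prefix followed by *: " + query);
        }
        return query.substring(0, query.length() - 1) + MARKER;
    }
}
```

With the expanded terms indexed at the same position as the original token, a
query like ca* turns into a lookup of the single term ca$, so no clause
expansion is needed at search time. The price is a larger term dictionary.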

The second is the a??b????c kind, for which I don't see any clever indexing
scheme, but let me think about that more at lunch <G>.
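
For reference, here is a rough sketch of what expanding such a pattern against
a list of terms looks like, assuming ? means exactly one character and * means
any run of characters. WildcardTerms and its methods are hypothetical names,
and the Iterable of strings stands in for walking the index's term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Lucene class): expands a wildcard against known terms. */
class WildcardTerms {
    /** Translates a wildcard pattern into a regex; every other character is matched literally. */
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char ch : wildcard.toCharArray()) {
            if (ch == '?') {
                sb.append('.');          // exactly one character
            } else if (ch == '*') {
                sb.append(".*");         // zero or more characters
            } else {
                sb.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    /** Returns the terms that match the whole wildcard pattern, in iteration order. */
    static List<String> matching(String wildcard, Iterable<String> terms) {
        Pattern pattern = toRegex(wildcard);
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

Every term has to be tested, which is why this kind of wildcard gets slow on a
large term dictionary and why the number of matching terms can blow past a
clause limit.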

Without asking you to write *too* much for me, what should I look at in
order to make this work with SpanNear? I'm guessing that I have to iterate
the wildcard terms, collect a filter for documents "close enough" and use
that filtered query (again thanks for all your help figuring this out) to
restrict the returned documents.

What I have in mind is enumerating over the first wildcard term. For each
successive term, I'd ask whether the other terms are "close enough" and use
that answer to possibly turn off that document in the filter. Eventually I'll
have a filter that is what I want.
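
The idea can be sketched roughly as follows, in plain Java against an
in-memory stand-in for a positional index. Everything here is an assumption
for illustration: the ProximityFilter name, the map layout (term to document
id to sorted positions), and the use of a simple unordered position distance
rather than SpanNearQuery's actual slop rules. The sketch also turns bits on
for matching documents instead of turning them off, which yields the same
final filter.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Lucene class): marks documents where two term sets occur close together. */
class ProximityFilter {
    /**
     * index: term -> (docId -> sorted token positions), a stand-in for a real positional index.
     * firstTerms and secondTerms are the expansions of the two wildcards.
     */
    static BitSet build(Map<String, Map<Integer, int[]>> index,
                        List<String> firstTerms, List<String> secondTerms,
                        int maxDistance, int maxDoc) {
        BitSet filter = new BitSet(maxDoc);
        for (String first : firstTerms) {
            Map<Integer, int[]> firstDocs = index.get(first);
            if (firstDocs == null) continue;            // term not in the index
            for (String second : secondTerms) {
                Map<Integer, int[]> secondDocs = index.get(second);
                if (secondDocs == null) continue;
                for (Map.Entry<Integer, int[]> entry : firstDocs.entrySet()) {
                    int doc = entry.getKey();
                    if (filter.get(doc)) continue;      // already accepted
                    int[] other = secondDocs.get(doc);
                    if (other != null && closeEnough(entry.getValue(), other, maxDistance)) {
                        filter.set(doc);
                    }
                }
            }
        }
        return filter;
    }

    /** True if some position in a is within maxDistance of some position in b (both sorted ascending). */
    static boolean closeEnough(int[] a, int[] b, int maxDistance) {
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (Math.abs(a[i] - b[j]) <= maxDistance) return true;
            if (a[i] < b[j]) i++; else j++;             // advance the smaller position
        }
        return false;
    }
}
```

The nested loops make the cost visible: the work grows with the number of
expansions of the first wildcard times the number of expansions of the second
times the documents they appear in.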

Now, this all seems expensive and complex. If the answer is "don't go there,
use clever indexing or don't try it with very many documents because it'll
take forever", then that in itself would save me a bunch of work.

Thanks again
Erick
