lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <>
Subject Re: I just don't get wildcards at all.
Date Fri, 14 Apr 2006 01:19:57 GMT
Wildcard stuff gets fun and interesting.  Have a look at  
SpanRegexQuery in the contrib/regex codebase.  It is a more  
generalized version of WildcardQuery, but within the SpanQuery family.


On Apr 13, 2006, at 11:22 AM, Erick Erickson wrote:

> More of the same....
> So, now all I want to do is a SpanNearQuery on a wildcard. No  
> problem if I
> can use a wildcard query, but I can't because of the  
> "TooManyClauses" issue.
> We have two types of wildcards, truncation (which I can handle by  
> indexing
> successively shorter terms and a special character at the end. i.e.  
> cat$ ca$
> c$). This is neat, it gives me relevance, speed, all the virtues.
> The second is a??b????c. Which I don't see any clever indexing  
> schemes to
> handle, but let me think about that more at lunch <G>.
> Without asking you to write *too* much for me, what should I look  
> at in
> order to make this work with SpanNear? I'm guessing that I have to  
> iterate
> the wildcard terms, collect a filter for documents "close enough"  
> and use
> that filtered query (again thanks for all your help figuring this  
> out) to
> restrict the returned documents.
> What I have in mind is enumerating over the first wildcard term.  
> For each
> successive term, ask if other terms are "close enough" and using  
> that answer
> to possibly turn off that document in the filter. Eventually I'll  
> have a
> filter that is what I want.
> Now, this all seems expensive and complex. If the answer is "don't  
> go there,
> use clever indexing or don't try it with very many documents  
> because it'll
> take forever", then that in intself would save me a bunch of work.
> Thanks again
> Erick

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message