lucene-java-user mailing list archives

From Volodymyr Bychkoviak <>
Subject Re: Question for Wildcard Search:
Date Thu, 23 Jun 2005 10:27:12 GMT
About 3 months ago I posted an idea about wildcard searching.

The main idea was to index every character of the input as a separate term
and then search using PhraseQuery.
For example, the word "12345" would be indexed as "1" "2" "3" "4" "5". To
find "*23*" you can use a PhraseQuery with these two terms ("2" "3"). But
this approach is limited to queries with wildcards only at the beginning or end.
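
To make the idea concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: a small map from character to positions stands in for the positional index, and the class and method names (CharPhraseSketch, index, matchesPhrase) are made up for illustration. It only shows why an exact phrase match over single-character terms behaves like a "*23*" substring search.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a positional index over one word; hypothetical, not Lucene code. */
public class CharPhraseSketch {

    /** Indexes every character as its own term and records the positions where it occurs. */
    public static Map<Character, List<Integer>> index(String word) {
        Map<Character, List<Integer>> postings = new HashMap<>();
        for (int i = 0; i < word.length(); i++) {
            postings.computeIfAbsent(word.charAt(i), k -> new ArrayList<>()).add(i);
        }
        return postings;
    }

    /**
     * Returns true if the characters of phrase occur at consecutive positions,
     * which is what an exact phrase match over character terms amounts to.
     */
    public static boolean matchesPhrase(Map<Character, List<Integer>> postings, String phrase) {
        if (phrase.isEmpty()) {
            return true;
        }
        List<Integer> starts = postings.get(phrase.charAt(0));
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean ok = true;
            for (int k = 1; k < phrase.length() && ok; k++) {
                List<Integer> positions = postings.get(phrase.charAt(k));
                ok = positions != null && positions.contains(start + k);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Character, List<Integer>> postings = index("12345");
        System.out.println(matchesPhrase(postings, "23")); // "*23*" -> true
        System.out.println(matchesPhrase(postings, "24")); // "*24*" -> false
    }
}
```

Because the phrase terms must sit at adjacent positions, a wildcard in the middle of the pattern (for example "2*4") cannot be expressed this way.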

Later I did some research and wrote an extension to PhraseQuery that allows
setting a term's relative position to a range of values (to insert gaps for "*"
and "?"). This approach is good because it does not rewrite queries and
never runs into OutOfMemory or TooManyClauses exceptions.
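
The extension itself is not shown in this message, so the following is only a rough sketch of how gap ranges between terms could work, again in plain Java without Lucene. All names (GapPhraseSketch, GapTerm, term, matches) are hypothetical. Each query term carries a minimum and maximum number of skipped positions since the previous term: "?" corresponds to a gap of exactly 1, "*" to a gap of 0 or more. Leading and trailing wildcards are assumed to be handled as in the plain phrase case and are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a phrase match with per-term gap ranges; hypothetical, not the actual extension. */
public class GapPhraseSketch {

    /** A single-character query term plus the allowed number of skipped positions before it. */
    public static class GapTerm {
        final char term;
        final int minGap;
        final int maxGap;

        GapTerm(char term, int minGap, int maxGap) {
            this.term = term;
            this.minGap = minGap;
            this.maxGap = maxGap;
        }
    }

    public static GapTerm term(char c, int minGap, int maxGap) {
        return new GapTerm(c, minGap, maxGap);
    }

    /** Indexes every character as its own term with its positions. */
    public static Map<Character, List<Integer>> index(String word) {
        Map<Character, List<Integer>> postings = new HashMap<>();
        for (int i = 0; i < word.length(); i++) {
            postings.computeIfAbsent(word.charAt(i), k -> new ArrayList<>()).add(i);
        }
        return postings;
    }

    /** Tries every occurrence of the first term as a starting point (its own gap is ignored). */
    public static boolean matches(Map<Character, List<Integer>> postings, List<GapTerm> query) {
        if (query.isEmpty()) {
            return true;
        }
        List<Integer> starts = postings.get(query.get(0).term);
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            if (matchFrom(postings, query, 1, start)) {
                return true;
            }
        }
        return false;
    }

    /** Checks that term i occurs after prevPos with a gap inside its allowed range, then recurses. */
    private static boolean matchFrom(Map<Character, List<Integer>> postings,
                                     List<GapTerm> query, int i, int prevPos) {
        if (i == query.size()) {
            return true;
        }
        GapTerm t = query.get(i);
        List<Integer> positions = postings.get(t.term);
        if (positions == null) {
            return false;
        }
        for (int pos : positions) {
            int gap = pos - prevPos - 1; // number of characters skipped between the two terms
            if (gap >= t.minGap && gap <= t.maxGap && matchFrom(postings, query, i + 1, pos)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Character, List<Integer>> postings = index("12345");
        // "2?4": exactly one character between "2" and "4"
        System.out.println(matches(postings, Arrays.asList(term('2', 0, 0), term('4', 1, 1))));
        // "1*5": any number of characters between "1" and "5"
        System.out.println(matches(postings, Arrays.asList(term('1', 0, 0), term('5', 0, Integer.MAX_VALUE))));
        // "2?5": two characters lie between "2" and "5", so this does not match
        System.out.println(matches(postings, Arrays.asList(term('2', 0, 0), term('5', 1, 1))));
    }
}
```

The point of the sketch is that only the positions of the literal terms are consulted. The wildcard is never expanded into the set of matching index terms, which is why no query rewrite and no large clause list is involved.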

Volodymyr Bychkoviak

14.03.2005 13:54

Dave Kor wrote:

>Quoting Dave Kor <>:
>>Quoting Erik Hatcher <>:
>>>Anyone tried this technique with Lucene?
>>Actually, the problem is that the wildcard code has to search over a large
>>subset of terms because the list of terms is, well, a linear structure.
>>If, for example, all terms in the index are arranged as a suffix tree, the
>>of wildcard search that currently is cpu intensive will no longer be cpu
>Hmm I realized I should add a qualifier to the above statement. Searching for
>matching terms would no longer be cpu intensive, especially for wildcards like
>*foo* or *foo. The other wildcard search problem of having too many matching
>terms to lookup in the index still remains unsolved.
