lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Confusion over wildcard search logic
Date Tue, 23 Sep 2003 14:45:25 GMT
On Tuesday, September 23, 2003, at 10:09  AM, Dan Quaroni wrote:
> Yeah, thanks a lot for your help!  I'm using the release version of
> Lucene version 1.2.

Perhaps give the latest codebase a try too, just to see if any fixes 
(particularly in that WildcardQuery.toString) are there.

>> you're getting hits on the wildcard match at least, and probably on
>> name field "amb" as well.  again, phrase queries don't support
>> wildcards like you've done here with "south san fran*" so you're not
>> matching anything with that.
>
> Ok... What's the correct procedure for doing a multi-word wildcard
> where I want it to begin with "south san fran" but not get anything
> else that contains "south" or "san"?  Just and together the south,
> san, and fran?

As far as I know, there isn't currently a way to do this with 
QueryParser.  With the existing API, the way to do it is 
PhrasePrefixQuery plus some manual setup (as you'll see in its current 
test case and Javadocs): enumerate all the terms that start with "fran" 
and pass them to a PhrasePrefixQuery along with "south" and "san".  
(Isn't this class misnamed?  What does it have to do with "prefix"?)
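
To make the "enumerate, then match as a phrase" idea concrete, here is a 
rough sketch in plain Java collections.  It does not use any Lucene 
classes: the class name, the method names, and the sample terms are all 
made up for illustration, and a TreeSet stands in for the index's sorted 
term dictionary.  Treat it as a picture of the concept, not as what 
PhrasePrefixQuery does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Lucene.
class PrefixPhraseSketch {

    // Collect every term in the sorted dictionary that starts with the prefix.
    static List<String> expandPrefix(SortedSet<String> dictionary, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: we have walked past the prefix range
            }
            matches.add(term);
        }
        return matches;
    }

    // True if the tokens contain the leading words back to back,
    // immediately followed by any one of the expanded last terms.
    static boolean matchesPhrase(List<String> tokens, List<String> leading,
                                 List<String> lastTerms) {
        int n = leading.size();
        for (int start = 0; start + n < tokens.size(); start++) {
            if (tokens.subList(start, start + n).equals(leading)
                    && lastTerms.contains(tokens.get(start + n))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed sample dictionary; a real one would come from the index.
        SortedSet<String> dictionary = new TreeSet<>(Arrays.asList(
                "amb", "francisco", "frank", "fresno", "san", "south"));
        List<String> franTerms = expandPrefix(dictionary, "fran");
        List<String> leading = Arrays.asList("south", "san");

        List<String> doc = Arrays.asList("south", "san", "francisco", "ca");
        System.out.println(franTerms);
        System.out.println(matchesPhrase(doc, leading, franTerms));
    }
}
```

The point of doing the expansion up front is that the phrase match then 
only has to deal with concrete terms at each position, which is why no 
wildcard support is needed inside the phrase itself.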

> Although this might produce good results, my understanding was that
> booleans retrieve all matches and store them in memory then resolve
> the booleans.  If I use the term "san" to search California, I'm
> going to need a lot of memory to store all of the temporary
> results...!

+south +san +fran* ought to do the trick.  I wouldn't worry about 
memory too much until you've seen it become a problem.  I think you'll 
be fine (though I don't currently have the understanding or data to 
back that up).
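
For what the three required clauses mean, here is a small sketch, again 
in plain Java with no Lucene classes.  The doc-id lists and all names 
are invented for illustration.  It shows one way required clauses over 
sorted doc-id lists could be combined by walking cursors, so that only 
the running result is held rather than a separate temporary set per 
clause.  I am not claiming this is how Lucene 1.2 resolves booleans.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Lucene.
class RequiredClauseSketch {

    // Docs present in both ascending lists (what two "+" clauses require).
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Docs present in either ascending list (the terms fran* expands to).
    static int[] union(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length || j < b.length) {
            if (j == b.length || (i < a.length && a[i] < b[j])) {
                out[n++] = a[i++];
            } else if (i == a.length || b[j] < a[i]) {
                out[n++] = b[j++];
            } else {
                out[n++] = a[i]; // same doc in both lists: keep it once
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        // Assumed sample postings: ascending doc ids per term.
        int[] south = {1, 3, 4, 7, 9};
        int[] san = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] francisco = {3, 9};
        int[] frank = {4, 6};

        int[] fran = union(francisco, frank);              // fran*
        int[] hits = intersect(intersect(south, san), fran);
        System.out.println(Arrays.toString(hits));
    }
}
```

Note that this kind of AND only requires all three to appear somewhere 
in the document.  It does not require them to be adjacent or in order, 
so it can return more than the phrase-style match would.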

	Erik

