lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Good and performance and fuzzy search
Date Wed, 10 Dec 2003 15:26:31 GMT
On Wednesday, December 10, 2003, at 04:07  PM, julien gerard wrote:
> I'm attempting to optimize a fuzzy search on a big index with 
> ~4.400.000 Documents ( lucene's meanning ) in 600.000 sub-categories 
> (Simple Text.Keyword type a field ).
>
> My purpose is to limit the amount of documents on wich the fuzzy 
> search with levenhstein disance is performed ( an user cannot search 
> on the 600.000 sub-categories but on 1 to 3 max )
>
> the classics lucenes ways to do that are not adapted to my case :
> - multiple indexes : having 600.000 indexes is a nightmare for 
> maintenance.
> - QueryFilter is not adapted because it's the fuzzy search which is in 
> The QueryFilter and the number of different request is too important, 
> so I cannot reuse the same.
> - The BooleanQuery with 'AND' parameter is also not adapted because 
> the two search are executed and after the results are merged.
>

QueryFilter would do the trick if you instead used the query you handed 
to it to be the one to single out a "sub-category".  It would limit the 
documents searched to just the sub-category, and the fuzzy search would 
be done using IndexSearcher.search, only handing it the filter then as 
well.

Will this scheme work for you?

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message