lucene-java-user mailing list archives

From Martin Braun <mbr...@uni-hd.de>
Subject Re: dash-words
Date Mon, 24 Jul 2006 07:54:03 GMT
Yonik Seeley wrote:
> On 7/23/06, karl wettin <karl.wettin@gmail.com> wrote:
>> I want to filter words with a dash in them.
>>
>> ["x-men"]
>> ["xmen"]
>> ["x", "men"]
>>
>> All of the above should be synonyms. The problem is that ["x", "men"]
>> requires a distance between the terms and thus also matches "x-men men".
> 
> WordDelimiterFilter from Solr does this:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-1c9b83870ca7890cd73b193cefed83c283339089
> 

I can recommend this too. I use it and it works fine! I just apply a
LowerCaseFilter after it to avoid this downside:
'if source text is "powershot" then a query of "PowerShot" won't match!'
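
To show why the lowercasing step helps, here is a minimal sketch in plain Java. It is only a stand-in: it does not use Lucene or Solr, and the class DashWordsSketch and its methods are hypothetical names made up for this illustration. It assumes a delimiter step that splits a token at non-alphanumeric characters and at lower-to-upper case changes and also emits the joined form, followed by a lowercasing step. The real WordDelimiterFilter may behave differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative stand-in only: NOT Solr's WordDelimiterFilter or Lucene's LowerCaseFilter.
class DashWordsSketch {

    // Split a token at non-alphanumeric characters and at lower-to-upper case
    // changes; if that yields more than one part, also append the joined form.
    static List<String> splitOnDelimiters(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                flush(current, parts);          // a delimiter such as '-' ends the part
                continue;
            }
            boolean caseChange = current.length() > 0
                    && Character.isLowerCase(token.charAt(i - 1))
                    && Character.isUpperCase(c);
            if (caseChange) {
                flush(current, parts);          // "rS" in "PowerShot" starts a new part
            }
            current.append(c);
        }
        flush(current, parts);
        if (parts.size() > 1) {
            parts.add(String.join("", parts));  // joined form, e.g. "xmen"
        }
        return parts;
    }

    // Lowercase every token, mimicking a lowercasing step applied afterwards.
    static List<String> lowercase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Full sketch pipeline: delimiter split first, lowercasing second.
    static List<String> analyze(String token) {
        return lowercase(splitOnDelimiters(token));
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze("x-men"));                // [x, men, xmen]
        System.out.println(analyze("PowerShot"));            // [power, shot, powershot]
        System.out.println(analyze("powershot"));            // [powershot]
        System.out.println(splitOnDelimiters("PowerShot"));  // [Power, Shot, PowerShot]
    }
}
```

In this sketch the source text "powershot" yields the single token "powershot". Without the lowercasing step the query "PowerShot" yields only "Power", "Shot" and "PowerShot", none of which equals it. With lowercasing applied afterwards, the query tokens include "powershot" and the two sides share a term.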


> 
> It also has the false match problem you mention... "x xmen" would
> match a document with x-men, although this hasn't been a problem in
> practice.
> 
> -Yonik
> http://incubator.apache.org/solr Solr, the open-source Lucene search server
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


-- 
Universitaetsbibliothek Heidelberg   Tel: +49 6221 54-2580
Ploeck 107-109, D-69117 Heidelberg   Fax: +49 6221 54-2623
