lucene-java-user mailing list archives

From "Yonik Seeley" <ysee...@gmail.com>
Subject Re: dash-words
Date Mon, 24 Jul 2006 04:34:09 GMT
On 7/23/06, karl wettin <karl.wettin@gmail.com> wrote:
> I want to filter words with a dash in them.
>
> ["x-men"]
> ["xmen"]
> ["x", "men"]
>
> All of the above should be synonyms. The problem is that ["x", "men"] requires a
> distance between the terms and thus also matches "x-men men".

WordDelimiterFilter from Solr does this:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-1c9b83870ca7890cd73b193cefed83c283339089

It also has the false-match problem you mention: a query for "x xmen"
would match a document containing x-men, although this hasn't been a
problem in practice.
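
To make the false match concrete, here is a minimal plain-Java sketch of the idea. It is not the Solr code: the class DashWordSketch, the Tok type, and both methods are hypothetical names made up for illustration, and it assumes the joined form ("xmen") is indexed at the same position as the last part ("men"), which is my understanding of how the filter stacks tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified model of dash-word splitting; NOT the actual WordDelimiterFilter.
class DashWordSketch {

    /** A term and its position in the token stream (hypothetical helper type). */
    static final class Tok {
        final String text;
        final int pos;
        Tok(String text, int pos) { this.text = text; this.pos = pos; }
        @Override public String toString() { return text + "@" + pos; }
    }

    /** Splits on whitespace, then on dashes; emits the parts plus the joined form. */
    static List<Tok> analyze(String input) {
        List<Tok> out = new ArrayList<>();
        int pos = 0;
        for (String word : input.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            List<String> parts = new ArrayList<>();
            for (String p : word.split("-+")) {
                if (!p.isEmpty()) parts.add(p);
            }
            for (String p : parts) {
                out.add(new Tok(p, pos++));
            }
            if (parts.size() > 1) {
                // Assumption: the joined form shares the position of the last part.
                out.add(new Tok(String.join("", parts), pos - 1));
            }
        }
        return out;
    }

    /** Exact phrase match (no slop): term i must occur at position start + i. */
    static boolean phraseMatches(String doc, String... queryTerms) {
        if (queryTerms.length == 0) return false;
        List<Tok> toks = analyze(doc);
        for (Tok start : toks) {
            if (!start.text.equals(queryTerms[0])) continue;
            boolean ok = true;
            for (int i = 1; i < queryTerms.length && ok; i++) {
                ok = hasTermAt(toks, queryTerms[i], start.pos + i);
            }
            if (ok) return true;
        }
        return false;
    }

    private static boolean hasTermAt(List<Tok> toks, String term, int pos) {
        for (Tok t : toks) {
            if (t.pos == pos && t.text.equals(term)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(analyze("x-men"));                       // token stream
        System.out.println(phraseMatches("x-men", "x", "men"));     // wanted match
        System.out.println(phraseMatches("x-men", "xmen"));         // wanted match
        System.out.println(phraseMatches("x-men", "x", "xmen"));    // the false match
    }
}
```

Under that assumption all three spellings match without any slop, because "men" and "xmen" sit on the same position. The price is that the phrase "x xmen" lines up with the stream as well, which is the false match described above.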

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
