lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: A field-wide remove duplicate tokens filter
Date Wed, 17 Dec 2014 22:43:54 GMT
Why is that useful? It breaks phrase search.
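
To make the phrase-search point concrete, here is a rough sketch in plain Java (no Lucene involved). It assumes token positions can be modeled as list indexes and that a phrase matches only when its terms sit at consecutive positions. The class and method names are made up for illustration and are not part of any Solr or Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

class PhraseDedupDemo {
    // Hypothetical field-wide dedup: keep only the first occurrence of each token.
    static List<String> dedupFieldWide(List<String> tokens) {
        return new ArrayList<>(new LinkedHashSet<>(tokens));
    }

    // Naive phrase match: the phrase terms must appear at consecutive positions.
    static boolean containsPhrase(List<String> tokens, List<String> phrase) {
        return Collections.indexOfSubList(tokens, phrase) >= 0;
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("to", "be", "or", "not", "to", "be");
        List<String> phrase = Arrays.asList("not", "to", "be");

        // The original token stream still contains the phrase.
        System.out.println(containsPhrase(field, phrase));                 // true
        // After dedup only [to, be, or, not] remains, so the phrase is gone.
        System.out.println(containsPhrase(dedupFieldWide(field), phrase)); // false
    }
}
```

A filter that left position gaps where it dropped tokens would fail the same query, because the second "to" and "be" would no longer be indexed at all.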

If you want to ignore term frequency in ranking, change the Similarity class.
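
As a sketch of what ignoring term frequency means for scoring, the plain-Java snippet below compares a square-root term-frequency factor (the shape used by classic TF-IDF scoring) with a binary one. The class and method names are hypothetical. In Lucene 4.x the corresponding hook is, as far as I recall, the tf(float freq) method of the TF-IDF similarity classes, which a custom Similarity subclass could override and which Solr can pick up through a similarity element in schema.xml. Treat those details as assumptions to check against your version.

```java
class TfDemo {
    // Classic TF-IDF style factor: grows with the square root of the count.
    static float sqrtTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Binary variant: a term that occurs at all counts once, however often it repeats.
    static float binaryTf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        // A term that occurs four times in a field.
        System.out.println(sqrtTf(4f));   // 2.0
        System.out.println(binaryTf(4f)); // 1.0
        // A term that occurs once scores the same under the binary factor.
        System.out.println(binaryTf(1f)); // 1.0
    }
}
```

With the binary factor, repeated tokens stop influencing the score while every position stays in the index, so phrase queries keep working.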

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/


On Dec 17, 2014, at 2:40 PM, Varun Rajput <varun_sf@hotmail.com> wrote:

> The org.apache.solr.analysis.RemoveDuplicatesTokenFilter, as per its description, "Filters out any tokens which are at the same logical position in the tokenstream as a previous token with the same text."
> A very useful filter would be one which filters out duplicate tokens throughout the field, irrespective of the logical position of the token. Does something like this exist already, or is it planned for inclusion in a coming release?
> I have an implementation of this in one of my projects and can contribute it if the community finds it useful as well.
> Best, Varun

