lucene-solr-user mailing list archives

From Emir Arnautovic <emir.arnauto...@sematext.com>
Subject Re: Removing duplicate terms from query
Date Thu, 09 Feb 2017 12:52:21 GMT
Hi Ere,

I don't think there is such a filter. Implementing one would require
looking back at previously seen tokens, which goes against the streaming
approach of token filters and would lead to unpredictable memory usage.

I would do it in a query preprocessor, and not necessarily as part
of Solr.
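
A minimal sketch of what such a client-side preprocessor could look like,
assuming simple non-phrase queries like the ones in the original message.
The function name dedupe_query_terms and the normalization it applies
(lowercasing, stripping surrounding punctuation) are hypothetical choices
for illustration and are not part of Solr; they would need to match
whatever the field's query analyzer actually does.

```python
def dedupe_query_terms(query: str) -> str:
    """Drop repeated terms from a plain (non-phrase) query.

    Keeps the first occurrence of each term, in the original order.
    Assumption: terms are whitespace-separated and the target field is
    analyzed case-insensitively, so comparing lowercased terms is safe.
    """
    seen = set()
    kept = []
    for token in query.split():
        # Hypothetical normalization: ignore surrounding punctuation and case.
        key = token.strip(".,;:!?").lower()
        if not key:
            continue  # token was punctuation only, e.g. ";"
        if key in seen:
            continue  # duplicate term, skip it
        seen.add(key)
        kept.append(key)
    return " ".join(kept)


if __name__ == "__main__":
    print(dedupe_query_terms("term term term"))
    print(dedupe_query_terms("Hey, hey, hey ; Shivers of pleasure"))
```

This only makes sense when term order and repetition do not matter. Quoted
phrases, field prefixes and boolean operators are not handled here and
should be passed through unchanged or parsed properly.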

HTH,
Emir


On 09.02.2017 12:24, Ere Maijala wrote:
> Hi,
>
> I just noticed that while we use RemoveDuplicatesTokenFilter during 
> query time, it will consider term positions and not really do anything 
> e.g. if query is 'term term term'. As far as I can see the term 
> positions make no difference in a simple non-phrase search. Is there a 
> built-in way to deal with this? I know I can write a filter to do 
> this, but I feel like this would be something quite basic to do for 
> the query. And I don't think it's even anything too weird for normal 
> users to do. Just consider e.g. searching for music by title:
>
> Hey, hey, hey ; Shivers of pleasure
>
> I also verified that at least according to debugQuery=true and 
> anecdotal evidence the search really slows down if you repeat the same 
> term enough.
>
> --Ere

-- 
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

