lucene-solr-user mailing list archives

From Rounak Jain <rouna...@gmail.com>
Subject Configure Shingle Filter to ignore ngrams made of tokens with same start and end
Date Fri, 03 May 2013 14:34:28 GMT
Hello,

I am using ShingleFilter with the Suggester to implement an autosuggest
dropdown. The field I'm using with the shingle filter has a
WordDelimiterFilter with preserveOriginal=1, so that "women's" is
tokenized as both "women's" and "womens".

Because of this, when the shingle filter generates word n-grams, it emits
a "women's womens" token in addition to the expected ones. Is there any
way to configure ShingleFilter so that it ignores n-grams made of tokens
with the same start and end offsets?
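
To make the behaviour concrete, here is a minimal Python sketch of what I
am seeing and what I would like instead. It is only a model, not Lucene
code: the (term, start, end) tuples, the shingles() function and the
skip_same_span flag are all hypothetical names made up for illustration,
and the offsets assume the input text "women's shoes".

```python
from typing import List, Tuple

# Hypothetical token model: (term, start_offset, end_offset).
Token = Tuple[str, int, int]


def shingles(tokens: List[Token], size: int = 2,
             skip_same_span: bool = False) -> List[str]:
    """Join each run of `size` consecutive tokens into one shingle.

    With skip_same_span=True, drop any shingle in which two adjacent
    tokens share the same start and end offsets.
    """
    out = []
    for i in range(len(tokens) - size + 1):
        window = tokens[i:i + size]
        same_span = any(a[1:] == b[1:] for a, b in zip(window, window[1:]))
        if skip_same_span and same_span:
            continue
        out.append(" ".join(term for term, _, _ in window))
    return out


# "women's" and "womens" cover the same span of the input text.
tokens = [("women's", 0, 7), ("womens", 0, 7), ("shoes", 8, 13)]

print(shingles(tokens))                       # includes "women's womens"
print(shingles(tokens, skip_same_span=True))  # drops it
```

The second call is the behaviour I am after: the shingle built from the
two tokens that cover the same span is skipped.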

Thanks,
Rounak
