lucene-solr-user mailing list archives

From Lukas Kahwe Smith
Subject facet+shingle in autosuggest
Date Thu, 11 Nov 2010 21:02:39 GMT

I am using a facet.prefix search with shingles in my autosuggest:
    <fieldType name="shingle" class="solr.TextField" positionIncrementGap="100" stored="false">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.ShingleFilterFactory"
          maxShingleSize="3" outputUnigrams="true" outputUnigramIfNoNgram="false" />
      </analyzer>
    </fieldType>

Now I would like to prevent stop words from appearing in the suggestions:

<lst name="autosuggest_shingle">
<int name="member states">52</int>
<int name="member states experiencing">6</int>
<int name="member states in">6</int>
<int name="member states the">5</int>
<int name="member states to">25</int>
<int name="member states with">7</int>
</lst>

Here I would like to filter out the last four suggestions. Is there a sensible way to bring
a stop word filter in here? In theory, the stop words could also appear as the first or
second word.

So I guess that when producing shingles I want to keep any stop word from becoming part of any shingle.
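
A minimal, untested sketch of that idea, assuming a stop filter placed ahead of the shingle
filter is enough. The field type name "shingle_nostop" and the file name "stopwords.txt" are
placeholders, not part of my current setup. I have not verified how ShingleFilterFactory
treats the positions left behind by removed stop words, so the resulting shingles might
bridge across the gaps or carry filler tokens:

    <fieldType name="shingle_nostop" class="solr.TextField" positionIncrementGap="100" stored="false">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <!-- Assumption: stopwords.txt is a stop word list in the core's conf directory -->
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <!-- Shingles are now built only from the tokens that survive the stop filter -->
        <filter class="solr.ShingleFilterFactory"
          maxShingleSize="3" outputUnigrams="true" outputUnigramIfNoNgram="false" />
      </analyzer>
    </fieldType>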

Lukas Kahwe Smith
