lucene-solr-user mailing list archives

From Jeff Newburn <jnewb...@zappos.com>
Subject Ngram Repeats
Date Wed, 24 Dec 2008 15:38:35 GMT
I have set up an ngram filter and have run into a problem.  Our index is
basically composed of products, with the product as the unique id.  Each
product also has a brand name assigned to it.  There are far fewer unique
brand names than products in the index.  I tried to set up an ngram field
based on the brand name, but it returns the same brand name over and over,
once for each product.  Essentially, if you search for brand names starting
with "as", you get the brand "asus" 15 times.  Is there a way to make the
ngram return only unique brand names?  I have attached the configuration
below.

        <fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1">
                <analyzer type="index">
                        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
                </analyzer>
                <analyzer type="query">
                        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                </analyzer>
        </fieldType>
-Jeff
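
To make the symptom concrete, here is a minimal Python sketch of what the analyzer chain above appears to do. It is a simulation, not Solr code: the `products` list, the helper names (`edge_ngrams`, `index_terms`, `search`) and the sample brands are all hypothetical, and the final de-duplication is done client-side purely for illustration. The point it shows is that every product document carries its own copy of the brand's edge n-grams, so a prefix query matches one document per product rather than one per brand.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Front-edge n-grams of a token (minGramSize=1, maxGramSize=20 as in the config)."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(brand):
    """Simulate the index analyzer: whitespace split, lowercase, edge n-grams."""
    terms = set()
    for token in brand.split():
        terms.update(edge_ngrams(token.lower()))
    return terms


# Hypothetical documents: one per product, each with a brand field.
products = [
    {"id": "p1", "brand": "Asus"},
    {"id": "p2", "brand": "Asus"},
    {"id": "p3", "brand": "Asus"},
    {"id": "p4", "brand": "Acer"},
]


def search(prefix):
    """Simulate the query analyzer (lowercase only) and match against indexed terms."""
    query_term = prefix.lower()
    return [p for p in products if query_term in index_terms(p["brand"])]


hits = search("as")
brands = [p["brand"] for p in hits]   # one entry per matching product
unique_brands = sorted(set(brands))   # collapsed to distinct brand names

print(brands)
print(unique_brands)
```

In this sketch the repetition comes from matching at the document level: the n-gram filter only controls which terms are indexed per document, while the result list still contains one hit per product.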
