lucene-solr-user mailing list archives

From "Ellery Leung" <elleryle...@be-o.com>
Subject If search matches index in the middle of filter chain, will result return?
Date Wed, 23 Nov 2011 02:54:36 GMT
Hi all,

I am using Solr 3.4 on Windows 7 with Jetty.

When I search on a field, the "Analysis" page in Solr shows that the
search string matches the indexed tokens in the middle of the index filter
chain. Here is the schema:

 

<fieldType name="substring_search" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

 

I am searching for an email address: office@officeofficeoffice.com.  If I
search for any part of it shorter than 20 characters, results are returned.
But when I search for the whole string, office@officeofficeoffice.com, no
results are returned.

 

As the "index" analyzer in the schema shows, when I search for the whole
string, it matches the output of every stage of the index chain before
NGramFilterFactory.  After the NGram stage, however, nothing matches.
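
To make the behavior above concrete, here is a minimal sketch in Python. It is not Solr code: the ngrams() helper is a hypothetical stand-in, and it assumes that NGramFilterFactory emits every substring of the token whose length is between minGramSize and maxGramSize and does not keep the original, longer token. Under that assumption, the 29-character address is never indexed as a single term when maxGramSize is 20, so the untouched query token has nothing to match.

```python
def ngrams(token, min_gram, max_gram):
    """Hypothetical stand-in for an n-gram filter: return every substring
    of `token` whose length is between min_gram and max_gram inclusive.
    A set is used for simplicity, so duplicate grams collapse."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


email = "office@officeofficeoffice.com"  # 29 characters

# Index side: grams of length 1..20, as in the schema above.
indexed = ngrams(email, 1, 20)

# Query side: the query analyzer has no NGram filter, so the whole
# lowercased string is looked up as one term.
short_query_matches = "office@office" in indexed  # 13 characters
full_query_matches = email in indexed             # 29 characters

# With a maximum gram size of at least len(email), the full string
# becomes one of the indexed grams.
indexed_wide = ngrams(email, 1, len(email))
full_query_matches_wide = email in indexed_wide

print(len(email), short_query_matches, full_query_matches, full_query_matches_wide)
```

In this sketch, the full-address lookup only succeeds once the maximum gram size reaches the length of the string being searched for, at the cost of a much larger set of indexed grams.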

 

Here are my questions:

- Is this behavior normal?

- In order to match "office@officeofficeoffice.com", do I have to make
maxGramSize larger (for example, 70)?

 

Thank you in advance for all your support.  This is a great community.

