lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@buyways.nl>
Subject Re: shingles work in analyzer but not real data
Date Wed, 01 Sep 2010 13:46:54 GMT
If your use-case is limited to this, why don't you encapsulate all queries in 
double quotes? 
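
A minimal sketch of that idea on the client side, assuming the queries are sent to Solr over HTTP from Python. The helper names and the `http://localhost:8983/solr/select` URL are hypothetical and not taken from Jeff's setup. Wrapping the string in double quotes should make the query parser hand the whole phrase to the field's query analyzer instead of splitting it on whitespace first:

```python
from urllib.parse import urlencode


def quote_query(user_query: str) -> str:
    """Wrap a raw query string in double quotes so it is parsed as one phrase."""
    # Escape backslashes first, then any embedded double quotes,
    # so the user's text cannot terminate the phrase early.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_select_url(base_url: str, user_query: str) -> str:
    """Build a select URL whose q parameter is the quoted phrase (hypothetical helper)."""
    params = {"q": quote_query(user_query)}
    return f"{base_url}?{urlencode(params)}"


# Assumed local Solr endpoint, for illustration only.
print(build_select_url("http://localhost:8983/solr/select", "fresh apple pie recipe"))
```

Appending `debugQuery=on` to such a request shows the parsed query, which helps to check whether the shingles are actually produced at query time.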

On Wednesday 01 September 2010 14:21:47 Jeff Rose wrote:
> Hi,
> We are using Solr to match query strings against a keyword database, where
> some of the keywords are actually more than one word. For example, a
> keyword might be "apple pie", and we only want it to match a query
> containing that word pair, not one containing only "apple". Here is
> the relevant piece of the schema.xml, defining the index and query
> pipelines:
> 
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.PatternTokenizerFactory" pattern=";"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.ShingleFilterFactory"/>
>   </analyzer>
> </fieldType>
> 
> In the analysis tool this schema looks like it works correctly. Our
> multi-word keywords are indexed as a single entry, and when a search
> phrase contains one of these multi-word keywords it is shingled and
> matched. Unfortunately, when we run the same queries against the actual
> index, it responds with zero matches. I can see in the index histogram
> that the terms are correctly indexed from our MySQL data source containing
> the keywords, but somehow the shingling doesn't appear to work on this
> live data. Does anyone have experience with shingling who might have
> some tips for us, or other advice for debugging the issue?
> 
> Thanks,
> Jeff
> 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

