lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: String search in Dismax handler
Date Fri, 24 Feb 2012 14:53:27 GMT
Watch out for StrField (solr.StrField); that may be where you're
having trouble. Take a close look at your admin/analysis page. If
"Pass by Value" is matching on a string field when quoted,
that'll explain why it isn't matching when not quoted.

The problem here is that the query parser (before it gets to
the field analysis chain) breaks up unquoted input into separate
tokens, so you have three tokens "Pass" "by" "Value". But
StrField does NOT analyze input in any way. So your index
contains a *single* token "Pass by Value" in StrField fields,
and querying for "Pass" "by" "Value" will NOT match, as it's
looking for three tokens and you only have one indexed.

StrField doesn't change capitalization either. It's almost always
used for machine-generated strings, where you don't have this
problem. In fact there's some discussion of deprecating it
since it seems to cause no end of confusion. If you want
something similar that allows you to, for instance, be
case insensitive, consider an analysis chain of
KeywordTokenizer and LowerCaseFilter, and, perhaps,
TrimFilter.
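
As a rough illustration of the KeywordTokenizer chain described above,
a field type along these lines might work. This is only a sketch: the
name "string_ci" is made up for the example and is not from the schema
quoted below, so adapt it to your own setup. It keeps the whole input
as one token, like a string field, but trims and lowercases it at both
index and query time.

```xml
<!-- Hypothetical field type; "string_ci" is an illustrative name, not part of the original schema. -->
<fieldType name="string_ci" class="solr.TextField">
  <analyzer>
    <!-- Emits the entire field value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Strips leading and trailing whitespace from that token -->
    <filter class="solr.TrimFilterFactory"/>
    <!-- Lowercases the token so matching is case insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Unquoted multi-word input is still split by the query parser before
this chain runs, so this helps with case and stray whitespace, not with
the quoting issue. Changing a field's type also means reindexing.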

Best
Erick

On Fri, Feb 24, 2012 at 9:08 AM, mechravi25 <mechravi25@yahoo.co.in> wrote:
> Hi,
>
> I had posted two different query strings by mistake. PFB the correct strings
> when "Pass by value" is the search word
>
> String given without quotes
>
> webapp=/solr path=/select/
> params={facet=true&f.typeFacet.facet.mincount=1&qf=name^2.3+text+x_name^0.3+id^0.3+xid^0.3&hl.fl=*&hl=true&f.rFacet.facet.mincount=1&rows=10&fl=*&start=0&q=pass+by+value&facet.field=typeFacet&facet.field=rFacet&qt=dismax}
> hits=0 status=0 QTime=63
>
> and the parsed query for the same is as follows
>
> <str name="parsedquery">+((DisjunctionMaxQuery((xid:pass^0.3 | id:pass^0.3 |
> x_name:pass^0.3 | text:pass | name:pass^2.3))
> DisjunctionMaxQuery((xid:by^0.3 | id:by^0.3))
> DisjunctionMaxQuery((xid:value^0.3 | id:value^0.3 | x_name:value^0.3 |
> text:value | name:value^2.3)))~3) ()</str>
> <str name="parsedquery_toString">+(((xid:pass^0.3 | id:pass^0.3 |
> x_name:pass^0.3 | text:pass | name:pass^2.3) (xid:by^0.3 | id:by^0.3)
> (xid:value^0.3 | id:value^0.3 | x_name:value^0.3 | text:value |
> name:value^2.3))~3) ()</str>
>
> String given with quotes
>
> webapp=/solr path=/select/
> params={facet=true&qf=name^2.3+text+x_name^0.3+id^0.3+xid^0.3&f.typeFacet.facet.mincount=1&hl.fl=*&f.rFacet.facet.mincount=1&hl=true&rows=10&fl=*&start=0&q="pass+by+value"&facet.field=typeFacet&facet.field=rFacet&qt=dismax}
> hits=4 status=0 QTime=411
>
> and its parsed query is
>
> <str name="parsedquery">+DisjunctionMaxQuery((xid:pass by value^0.3 |
> id:pass by value^0.3 | x_name:"pass ? value"^0.3 | text:"pass ? value" |
> name:"pass ? value"^2.3)) ()</str>
>
> <str name="parsedquery_toString">+(xid:pass by value^0.3 | id:pass by
> value^0.3 | x_name:"pass ? value"^0.3 | text:"pass ? value" | name:"pass ?
> value"^2.3) ()</str>
>
> Also, I am using a different field type for id and xid. It uses
> "solr.StrField" as its class, so we have not used the solr.StopFilterFactory
> for it. The field type is as follows
>
> <fieldtype name="string" class="solr.StrField" sortMissingLast="true"
> omitNorms="true"/>
>
> But I have used a different field type for name, text and x_name. This
> field type uses "solr.TextField" as its class and has the
> solr.StopFilterFactory as one of its filters. It is as follows
>
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="1" splitOnCaseChange="1" splitOnNumerics="1"
> stemEnglishPossessive="1" />
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.PhoneticFilterFactory" encoder="Soundex" inject="true"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true"/>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="0"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
> </fieldType>
>
>
> Please guide me. Thanks.
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/String-search-in-Dismax-handler-tp3766360p3772648.html
> Sent from the Solr - User mailing list archive at Nabble.com.
