lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: How can i get collect stemmed query?
Date Mon, 18 Oct 2010 10:33:39 GMT
rawquerystring = +body:flyaway
parsedquery = +body:fly +body:away

This shows that your custom filter is working as you expected.

However, you are using different tokenizers at query time (StandardTokenizer, hard-coded in
your analyzer) and at index time (WhitespaceTokenizer). That mismatch may be what causes numFound=0.

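For example, if your indexed document contains 'fly, away' in its body field, your query won't
return it, because of the comma: WhitespaceTokenizer indexes the token 'fly,' (comma included),
while the query looks for the term 'fly'.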
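
To make the comma problem concrete, here is a minimal sketch in plain Java. It does not use the Lucene classes: whitespaceTokens and punctuationStrippingTokens are hypothetical stand-ins that only approximate what a whitespace tokenizer and a punctuation-dropping tokenizer do, under the assumption that both sides lowercase their tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TokenizerMismatchDemo {
    // Hypothetical stand-in for index-time analysis: split on whitespace only, then lowercase.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Hypothetical stand-in for query-time analysis: split on anything that is
    // not a letter or digit, so punctuation never ends up inside a token.
    static List<String> punctuationStrippingTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> indexed = whitespaceTokens("fly, away");
        List<String> queryTerms = Arrays.asList("fly", "away");

        System.out.println(indexed);                          // [fly,, away]
        System.out.println(indexed.containsAll(queryTerms));  // false: "fly," is not "fly"
        System.out.println(punctuationStrippingTokens("fly, away").containsAll(queryTerms)); // true
    }
}
```

The point is only that a required term such as +body:fly cannot match an indexed token that still carries the comma.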

The admin/analysis.jsp page shows the tokens that are produced at index time.

You can issue a *:* query, returning only the body field, to check that the document really exists:
q=*:*&fl=body
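
If you send that request from code, the parameters should be URL-encoded. A small sketch follows; the base URL is an assumption (adjust host, port and path to your own instance), and MatchAllQueryUrl is just an illustrative name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class MatchAllQueryUrl {
    // Builds a select URL with the q and fl parameters URL-encoded.
    static String build(String baseUrl, String query, String fields) {
        try {
            return baseUrl
                + "?q=" + URLEncoder.encode(query, "UTF-8")
                + "&fl=" + URLEncoder.encode(fields, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base URL, not taken from the original message.
        System.out.println(build("http://localhost:8983/solr/select", "*:*", "body"));
        // prints http://localhost:8983/solr/select?q=*%3A*&fl=body
    }
}
```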

Your query analyzer definition should look like this:
<analyzer type="query" class="com.testsolr.ir.customAnalyzer.MyCustomQueryAnalyzer"/>

You cannot specify both an analyzer class and a tokenizer element in the same <analyzer> definition.

Once you get this working, it is better in your case to write a custom filter factory plug-in
and define the query analyzer with it (for performance reasons). The filter element below must
name your factory class (a TokenFilterFactory implementation), not the TokenFilter itself.
Plug-ins are also easier to load: http://wiki.apache.org/solr/SolrPlugins#How_to_Load_Plugins

<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LengthFilterFactory" min="1" max="40"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="com.testsolr.ir.KLTQueryStemFilter"/>
</analyzer>
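
The order of the stages in that chain matters, so here is a rough sketch of it in plain Java. Nothing in it is Solr or Lucene API: the tokenizer is approximated by a regex split, and the SPLITS table is a hypothetical stand-in for whatever your custom stemmer does with 'flyaway'.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class QueryChainSketch {
    // Hypothetical lookup standing in for the custom stemmer's compound splitting.
    static final Map<String, String[]> SPLITS = new HashMap<>();
    static {
        SPLITS.put("flyaway", new String[] {"fly", "away"});
    }

    // Applies the stages in order: tokenize, length filter, lowercase, stem/split.
    static List<String> analyze(String text, int minLen, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {       // tokenizer stage (approximation)
            if (raw.isEmpty() || raw.length() < minLen || raw.length() > maxLen) {
                continue;                                          // length filter stage
            }
            String lower = raw.toLowerCase(Locale.ROOT);           // lowercase stage
            String[] parts = SPLITS.get(lower);                    // stem/split stage
            if (parts == null) {
                out.add(lower);
            } else {
                for (String p : parts) {
                    out.add(p);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("FlyAway", 1, 40)); // [fly, away]
    }
}
```

Because lowercasing runs before the split lookup, 'FlyAway' and 'flyaway' end up as the same two terms.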


--- On Mon, 10/18/10, Jerad <age2k@naver.com> wrote:

> From: Jerad <age2k@naver.com>
> Subject: Re: How can i get collect stemmed query?
> To: solr-user@lucene.apache.org
> Date: Monday, October 18, 2010, 12:14 PM
> 
> Oops, I'm sorry! I found some mistakes in the source I posted previously
> (the main class name was wrong :<).
> 
> This is the correct analyzer source.
> -------------------------------------------------------------------------------------------------------
> public class MyCustomQueryAnalyzer extends Analyzer {
>     public static final Version LUCENE_VERSION = Version.LUCENE_29;
>     public static int QUERY_MIN_LEN_WORD_FILTER = 1;
>     public static int QUERY_MAX_LEN_WORD_FILTER = 40;
> 
>     public int elapsedTime = 0;
> 
>     @Override
>     public TokenStream tokenStream(String paramString, Reader reader) {
>         StandardTokenizer tokenizer = new StandardTokenizer(
>             du.utas.mcrdr.ir.lucene.WebDocIR.LUCENE_VERSION, reader);
> 
>         TokenStream tokenStream = new LengthFilter(tokenizer,
>             QUERY_MIN_LEN_WORD_FILTER, QUERY_MAX_LEN_WORD_FILTER);
>         tokenStream = new LowerCaseFilter(tokenStream);
> 
>         //My custom stemmer method
>         MyCustomSingleWordStemmer stemer = new MyCustomSingleWordStemmer(
>             QUERY_MIN_LEN_WORD_FILTER, QUERY_MAX_LEN_WORD_FILTER);
> 
>         //My custom analyzer filter. this filter return sub-merged query.
>         //ex) INPUT : flyaway
>         //    RETURN VALUE : fly +body:away
>         tokenStream = new KLTQueryStemFilter(tokenStream, stemer, this);
> 
>         return tokenStream;
>     }
> }
> 
> -------------------------------------------------------------------------------------------------------
> 
> [Additional info]
> 
> 1. MyCustomQueryAnalyzer was built outside of Solr.
>     I built this analyzer outside of the Solr package, packaged it as a .jar,
>     and placed it at
> 
>    
> ~/Solr/example/work/Jetty_0_0_0_0_8982_solr.war__solr__-2c5peu/webapp/WEB-INF/lib
> 
> 
> 2. I edited the field type and field name in schema.xml for the field
> to be searched.
> 
>     <field name="body" type="textTp" indexed="true" stored="true" omitNorms="true"/>
> 
>     <fieldType name="textTp" class="solr.TextField">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
>       <analyzer type="query" class="com.testsolr.ir.customAnalyzer.MyCustomQueryAnalyzer">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       </analyzer>
>     </fieldType>
> 
>     This is my custom schema.xml and custom search field type.
> 
> 3. I got this XML result when I appended &debugQuery=on to my search URL.
> 
> ------------------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8" ?>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">0</int>
>     <lst name="params">
>       <str name="debugQuery">on</str>
>       <str name="indent">on</str>
>       <str name="start">0</str>
>       <str name="q">+body:flyaway</str>
>       <str name="version">2.2</str>
>       <str name="rows">10</str>
>     </lst>
>   </lst>
>   <result name="response" numFound="0" start="0" />
>   <lst name="debug">
>     <str name="rawquerystring">+body:flyaway</str>
>     <str name="querystring">+body:flyaway</str>
>     <str name="parsedquery">+body:fly +body:away</str>
>     <str name="parsedquery_toString">+body:fly +body:away</str>
>     <lst name="explain" />
>     <str name="QParser">LuceneQParser</str>
>     <lst name="timing">
>       <double name="time">0.0</double>
>       <lst name="prepare">
>         <double name="time">0.0</double>
>         <lst name="org.apache.solr.handler.component.QueryComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.FacetComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.HighlightComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.StatsComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.DebugComponent">
>           <double name="time">0.0</double>
>         </lst>
>       </lst>
>       <lst name="process">
>         <double name="time">0.0</double>
>         <lst name="org.apache.solr.handler.component.QueryComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.FacetComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.HighlightComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.StatsComponent">
>           <double name="time">0.0</double>
>         </lst>
>         <lst name="org.apache.solr.handler.component.DebugComponent">
>           <double name="time">0.0</double>
>         </lst>
>       </lst>
>     </lst>
>   </lst>
> </response>
> ------------------------------------------------------------------------
> 
> I really appreciate your advice~ :)
> 
