lucene-solr-user mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3
Date Fri, 09 Sep 2011 13:40:18 GMT
Hi Marc,

StandardAnalyzer includes StopFilter.  See the Javadocs for Lucene 3.3 here: <http://lucene.apache.org/java/3_3_0/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html>

This is not new behavior - StandardAnalyzer in Lucene 2.9.1 (the version of Lucene bundled
with Solr 1.4) also includes a StopFilter: <http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html>

If you don't want a StopFilter configured, you can specify the individual components directly,
e.g. to get the equivalent of StandardAnalyzer, but without the StopFilter:

<fieldtype name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
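
For comparison, here is a rough sketch of the same chain with the StopFilter written out explicitly, which approximates what StandardAnalyzer does implicitly. The field type name "text_with_stop" is hypothetical, and the sketch assumes a stopwords.txt file exists in your Solr conf directory; neither comes from your schema.

```xml
<!-- Sketch only: "text_with_stop" is a made-up name, and stopwords.txt
     is assumed to exist in the Solr conf directory. -->
<fieldtype name="text_with_stop" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drops every token listed in stopwords.txt; if "a" is listed,
         a middlename of "A" produces no indexed term. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldtype>
```

If you want to keep stop filtering for other words, removing "a" from that file would likely be enough for your middlename case. You would need to reindex after changing the analyzer either way.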

Steve

> -----Original Message-----
> From: Marc Des Garets [mailto:marc.desgarets@192.com]
> Sent: Friday, September 09, 2011 6:21 AM
> To: solr-user@lucene.apache.org
> Subject: question about StandardAnalyzer, differences between solr 1.4
> and solr 3.3
> 
> Hi,
> 
> I have a simple field defined like this:
>     <fieldtype name="text" class="solr.TextField">
>       <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
>     </fieldtype>
> 
> Which I use here:
>    <field name="middlename" type="text" indexed="true" stored="true" required="false" />
> 
> In solr 1.4, I could do:
> ?q=(middlename:a*)
> 
> And I was getting all documents where middlename = A or where middlename
> starts with the letter A.
> 
> In solr 3.3, I get only results where middlename starts with the letter A
> but not where middlename is equal to A.
> 
> This happens only with the letter A. With other letters it works as
> expected: I get both the values starting with the letter and the values
> equal to the letter. My guess is that A is being treated as the English
> article, but I do not specify any stopword filter, so why does the letter
> A behave differently from the other letters? Is this a bug? How can I
> change my field so that the letter A works the same way as the other
> letters?
> 
> 
> Thanks,
> Marc