lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Text search NGram
Date Mon, 07 Mar 2016 14:54:26 GMT
The charFilter isn't doing anything useful - the whitespace tokenizer will
ignore extra white space anyway.
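
A minimal sketch of that point, in Python rather than Lucene: `collapse_whitespace` and `whitespace_tokens` are hypothetical stand-ins for the PatternReplaceCharFilterFactory (pattern="\s+", replacement=" ") and the WhitespaceTokenizerFactory, not the real implementations. Assuming the tokenizer splits on runs of whitespace and emits no empty tokens, the token stream is the same with or without the char filter.

```python
import re


def collapse_whitespace(text):
    # Stand-in (assumption) for the char filter: replace each run of
    # whitespace with a single space.
    return re.sub(r"\s+", " ", text)


def whitespace_tokens(text):
    # Stand-in (assumption) for the whitespace tokenizer: split on runs
    # of whitespace and never emit empty tokens.
    return text.split()


raw = "Microsoft   Visual \t Studio  2006"

with_filter = whitespace_tokens(collapse_whitespace(raw))
without_filter = whitespace_tokens(raw)

# Both paths produce the same tokens, so the char filter adds nothing here.
print(with_filter == without_filter)
```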

-- Jack Krupansky

On Mon, Mar 7, 2016 at 5:44 AM, G, Rajesh <rg@cebglobal.com> wrote:

> Hi Team,
>
> We have the field type below, and we have indexed the values "title":
> "Microsoft Visual Studio 2006" and "title": "Microsoft Visual Studio
> 8.0.61205.56 (2005)"
>
> When I search for title:(Microsoft Visual AND Studio AND 2005), I get
> Microsoft Visual Studio 2006 as the first record and Microsoft Visual
> Studio 8.0.61205.56 (2005) as the second. I want Microsoft Visual Studio
> 8.0.61205.56 (2005) to be listed first, since the user has searched for
> Microsoft Visual Studio 2005. Can you please help?
>
> We are using NGram so that it takes care of misspelled or jumbled words
> (it works as expected), e.g.
> searching Micrs Visual Studio gets Microsoft Visual Studio
> searching Visual Microsoft Studio gets Microsoft Visual Studio
>
>   <fieldType name="txt_token" class="solr.TextField" positionIncrementGap="0">
>     <analyzer type="index">
>       <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\s+" replacement=" "/>
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="800"/>
>     </analyzer>
>     <analyzer type="query">
>       <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\s+" replacement=" "/>
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="800"/>
>     </analyzer>
>   </fieldType>
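
As a rough illustration of what this analysis chain does to the year terms, the Python sketch below enumerates the grams for a single token. The `ngrams` helper is hypothetical and only approximates the lowercase plus NGram filters with minGramSize="2" and maxGramSize="800"; it does not model Solr scoring. Under that assumption, the query term 2005 shares several grams with 2006, so the 2006 title also matches on the year, while every gram of 2005 is contained in the indexed token (2005).

```python
def ngrams(token, min_size=2, max_size=800):
    # Hypothetical approximation of LowerCaseFilter + NGramFilter:
    # every substring of the lowercased token with a length between
    # min_size and max_size (capped at the token length).
    token = token.lower()
    longest = min(max_size, len(token))
    return {
        token[start:start + size]
        for size in range(min_size, longest + 1)
        for start in range(len(token) - size + 1)
    }


query_grams = ngrams("2005")

# Grams the query term has in common with the token from the 2006 title.
shared_with_2006 = query_grams & ngrams("2006")

# Whether the token "(2005)" from the other title covers every query gram.
covered_by_2005_title = query_grams <= ngrams("(2005)")

print(sorted(shared_with_2006))
print(covered_by_2005_title)
```

Which document ranks first still depends on scoring factors this sketch leaves out, such as field length and term frequency.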
