lucene-solr-user mailing list archives

From Amrit Sarkar <sarkaramr...@gmail.com>
Subject Re: Make search on the particular field to be case sensitive
Date Thu, 09 Nov 2017 15:50:01 GMT
Ah ok.

I didn't test it before posting. Thank you, Erick, for correcting me.

On 9 Nov 2017 9:06 p.m., "Erick Erickson" <erickerickson@gmail.com> wrote:

> This won't quite work. "string" types are totally un-analyzed, so you
> cannot add filters to a solr.StrField; you must use solr.TextField
> instead.
>
>
> <fieldType name="string" class="solr.TextField" sortMissingLast="true"
>            docValues="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
>
> Start over and re-index from scratch in a new collection, of course.
>
> You also need to make sure you really want to search on the whole
> field. The KeywordTokenizerFactory doesn't split the incoming text up
> _at all_. So if the input is "my dog has fleas", you can't search for
> just "dog" unless you use the extremely inefficient *dog* form. If you
> want to search for words, use a tokenizer that breaks up the input,
> WhitespaceTokenizer for instance.
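>
> A minimal sketch of such a word-level type, assuming a hypothetical
> fieldType name "text_ws" that is not part of the schema discussed here.
> The lowercase filter is carried over from the example above; leave it
> out if matching should stay case sensitive:
>
> ```xml
> <!-- "text_ws" is a hypothetical name used only for illustration -->
> <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <!-- splits "my dog has fleas" into: my / dog / has / fleas -->
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- optional: lowercases tokens at both index and query time -->
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> ```
>
> With a type like this, a plain query for "dog" should be able to match
> "my dog has fleas" without resorting to wildcards.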
>
> Best,
> Erick
>
> On Thu, Nov 9, 2017 at 3:24 AM, Amrit Sarkar <sarkaramrit2@gmail.com> wrote:
> > The behavior of field values is defined by the fieldType's analyzer
> > declaration.
> >
> > If you look at the managed-schema, you will find fieldType declarations like:
> >
> > <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.EnglishPossessiveFilterFactory"/>
> >     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >     <filter class="solr.PorterStemFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
> >     <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.EnglishPossessiveFilterFactory"/>
> >     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >     <filter class="solr.PorterStemFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> >
> > In your case the fieldType is "string". *You need to write an analyzer
> > chain for the same fieldType and not include:*
> >  <filter class="solr.LowerCaseFilterFactory"/>
> >
> > LowerCaseFilterFactory is responsible for lowercasing the tokens, both
> > at query time and while indexing.
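> >
> > To illustrate the effect of that filter, here is a sketch of two
> > analyzer chains that differ only in whether it is present. The
> > fieldType names "exact_cs" and "exact_ci" are hypothetical, and
> > solr.TextField is assumed as the class so that an analyzer can be
> > attached:
> >
> > ```xml
> > <!-- hypothetical names; both keep the whole value as a single token -->
> > <fieldType name="exact_cs" class="solr.TextField">
> >   <analyzer>
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <!-- no LowerCaseFilterFactory: "Dog" and "dog" stay distinct -->
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType name="exact_ci" class="solr.TextField">
> >   <analyzer>
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <!-- lowercased at index and query time: "Dog" matches "dog" -->
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> > ```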
> >
> > Something like this will work for you:
> >
> > <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> >            docValues="true">
> >   <analyzer>
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > I listed "KeywordTokenizerFactory" because this is a string field, not a text field.
> >
> > More details on: https://lucene.apache.org/solr/guide/6_6/analyzers.html
> >
> > Amrit Sarkar
> > Search Engineer
> > Lucidworks, Inc.
> > 415-589-9269
> > www.lucidworks.com
> > Twitter http://twitter.com/lucidworks
> > LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> > Medium: https://medium.com/@sarkaramrit2
> >
> > On Thu, Nov 9, 2017 at 4:41 PM, Karan Saini <maximus392@gmail.com> wrote:
> >
> >> Hi guys,
> >>
> >> Solr version :: 6.6.1
> >>
> >> *<field name="NameLine1" type="string" indexed="true" stored="true" />*
> >>
> >> I have around 10 fields in my core. I want to make the search on this
> >> specific field case sensitive. Please advise how to introduce case
> >> sensitivity at the field level. What changes do I need to make for this
> >> field?
> >>
> >> Thanks,
> >> Karan
> >>
>
