lucene-solr-user mailing list archives

From Josh Lincoln <josh.linc...@gmail.com>
Subject Re: text search problem
Date Wed, 23 Jul 2014 13:16:44 GMT
Ravi, for the hyphen issue, try setting autoGeneratePhraseQueries=true for
that fieldType (no re-index needed). As of schema version 1.4, this defaults
to false. One word of caution: autoGeneratePhraseQueries may not work as
expected for languages that aren't whitespace delimited. As Erick mentioned,
the Analysis page will help you verify that your content and your queries are
handled the way you expect them to be.
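
A minimal sketch of where the attribute would go, assuming the text_general
fieldType from your original message (quoted below). The inner comment stands
in for your existing analyzer chains, which would stay unchanged:

```xml
<!-- Sketch only: autoGeneratePhraseQueries is set on the fieldType element
     itself. It affects query parsing, not what is written to the index. -->
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <!-- existing index and query analyzer definitions go here, unchanged -->
</fieldType>
```

With this set, a single whitespace-delimited query term that analysis splits
into several tokens should be searched as a phrase rather than as separate
terms. The debug output will show whether that is what actually happens for
ABC-123 with your filters.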

See this thread for more info on autoGeneratePhraseQueries:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201202.mbox/%3C439F69A3-F292-482B-A102-7C011C576062@gmail.com%3E


On Mon, Jul 21, 2014 at 8:42 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Try escaping the hyphen as \-. Or enclosing it all
> in quotes.
>
> But you _really_ have to spend some time with the debug option
> and the admin/analysis page or you will find endless surprises.
>
> Best,
> Erick
>
>
> On Mon, Jul 21, 2014 at 11:12 AM, EXTERNAL Taminidi Ravi (ETI,
> Automotive-Service-Solutions) <external.Ravi.Taminidi@us.bosch.com> wrote:
>
> >
> > Thanks for the reply Erick, I will try as you suggested. I have
> > another question along these lines.
> >
> > When I have "-" in my description or name, the search results are
> > different. For example:
> >
> > For "ABC-123", it looks for ABC or 123. I want to treat this search as an
> > exact match, i.e. if my document has ABC-123 then I should get that result.
> >
> > When I check with &hl=on, it highlights <em>ABC</em> and returns the
> > results. How can I avoid this situation?
> >
> > Thanks
> >
> > Ravi
> >
> >
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerickson@gmail.com]
> > Sent: Saturday, July 19, 2014 4:40 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: text search problem
> >
> > Try adding &debug=all to the query and see what the parsed form of the
> > query is. Likely you're either
> > 1> using phrase queries, so "broadway hotel" requires both words in the
> >    text, or
> > 2> not using phrases, in which case you're searching for the AND of the
> >    two terms.
> >
> > But debug=all will show you.
> >
> > Plus, take a look at the admin/analysis page, your tokenization may not
> > be what you expect.
> >
> > Best,
> > Erick
> >
> >
> > On Fri, Jul 18, 2014 at 2:00 PM, EXTERNAL Taminidi Ravi (ETI,
> > Automotive-Service-Solutions) <external.Ravi.Taminidi@us.bosch.com> wrote:
> >
> > > Hi, below is my text_general field type. When I search Text:Broadway
> > > it does not return all the records, only a few.
> > > But when I search for Text:*Broadway*, it returns more records.
> > > When I search for multiple words like "Broadway Hotel", it may
> > > not match "Broadway", "Hotel" and "Broadway Hotel". Do you have any
> > > thoughts on how to handle this type of keyword search?
> > >
> > > Text:"Broadway,Vehicle Detailing,Water Systems,Vehicle Detailing,Car
> > > Wash Water Recovery"
> > >
> > > My field type looks like this:
> > >
> > > <fieldType name="text_general" class="solr.TextField"
> > >            positionIncrementGap="100">
> > >   <analyzer type="index">
> > >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > >             words="stopwords.txt"/>
> > >     <filter class="solr.KStemFilterFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0"
> > >             splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1"
> > >             catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
> > >
> > >     <!-- in this example, we will only use synonyms at query time
> > >     <filter class="solr.SynonymFilterFactory"
> > >             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> > >     -->
> > >
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <filter class="solr.KStemFilterFactory"/>
> > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > >             words="stopwords.txt"/>
> > >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> > >             ignoreCase="true" expand="true"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0"
> > >             splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1"
> > >             catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
> > >
> > >   </analyzer>
> > > </fieldType>
> > >
> > >
> > >
> > > Do you have any thoughts on this behavior, or on how to get this working?
> > >
> > > Thanks
> > >
> > > Ravi
> > >
> >
>
