lucene-solr-user mailing list archives

From simon <mtnes...@gmail.com>
Subject Re: Phrase Exact Match with Margin of Error
Date Thu, 15 Jun 2017 17:44:40 GMT
I think that's because the KeywordTokenizer by definition produces a single
token (not a phrase).

Perhaps you could create two fields with a copyField: the one you already
have (field1), and one tokenized using StandardTokenizer or
WhitespaceTokenizer (field2), which will produce multiple tokens that can be
matched as a phrase. Then construct a query which searches field1 for an
exact match and field2 using the ComplexPhraseQueryParser (use the
localparams syntax to combine them). Boost field1 (the exact match).
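
A minimal schema sketch of the two-field setup described above, in case it
helps. The names field1 and field2 are the placeholders used in this message,
myDummyExactMatch is the type from the original post, and myDummyTokenized is
a hypothetical name for the new tokenized type. The attribute values are
assumptions to adapt, not a tested configuration.

```xml
<!-- Hypothetical tokenized type: the same lowercasing as the exact-match
     type, but split into one token per word so phrase queries can work. -->
<fieldType name="myDummyTokenized" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- field1 keeps the whole title as a single token (exact match). -->
<field name="field1" type="myDummyExactMatch" indexed="true" stored="true"/>

<!-- field2 holds a tokenized copy for the complexphrase parser. -->
<field name="field2" type="myDummyTokenized" indexed="true" stored="false"/>

<!-- Populate field2 from field1 at index time. -->
<copyField source="field1" dest="field2"/>
```

With something like this in place, a combined query could look roughly like
the following (untested, shown unencoded, and assuming the default lucene
parser for q):

q=field1:"bridge the gat between your skills and your goals"^10 OR _query_:"{!complexphrase df=field2 v=$fz}"
fz="bridge the gat~1 between your skills and your goals"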

HTH

-Simon

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater <max.bridgewater@gmail.com>
wrote:

> Thanks Susheel. The challenge is that if I search for the word "between"
> alone, I still get plenty of results. In a way I want the query to  match
> the document title exactly (up to a few characters) and the document title
> match the query exactly (up to a few characters). KeywordTokenizer allows
> that. But complexphrase does not seem to work with KeywordTokenizer.
>
> On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar <susheel2777@gmail.com>
> wrote:
>
> > ComplexPhraseQueryParser is what you need to look at:
> > https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
> > See the example below.
> >
> >
> >
> > http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22&defType=complexphrase
> >
> > On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater <
> > max.bridgewater@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am trying to do phrase exact match. For this, I use
> > > KeywordTokenizerFactory. This basically does what I want to do. My field
> > > type is defined as follows:
> > >
> > > <fieldType name="myDummyExactMatch" class="solr.TextField"
> > >            positionIncrementGap="100">
> > >   <analyzer type="index">
> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >   </analyzer>
> > > </fieldType>
> > >
> > >
> > > In addition to this, I want to tolerate typos of two or three letters. I
> > > thought fuzzy search could allow me to accept this margin of error. But
> > > this doesn't seem to work.
> > >
> > > A typical query I would have is:
> > >
> > > q=subjet:"Bridge the gap between your skills and your goals"
> > >
> > > Now, in this query, if I replace gap with gat, I was hoping I could do
> > > something such as:
> > >
> > > q=subjet:"Bridge the gat between your skills and your goals"~0.8
> > >
> > > But this doesn't quite do what I am trying to achieve.
> > >
> > > Any suggestion?
> > >
> >
>
