lucene-java-user mailing list archives

From Savvas-Andreas Moysidis <savvas.andreas.moysi...@googlemail.com>
Subject Re: Phrase search
Date Thu, 11 Jun 2009 08:25:42 GMT
Hello,

You could use a PhraseQuery with the terms "cool", "gaming" and
"computer" and set whatever slop factor you reckon is right. You could then
assign a boost to this query only, which will make its matches bubble up
the list. I don't think you can get away without specifying a slop factor,
though (as in the proximity scenario you mention).

Regards,
Savvas


2009/6/11 Daniel Noll <daniel@nuix.com>

> On Fri, Jun 5, 2009 at 21:31, Abhi<abhirama.bhat@gmail.com> wrote:
> > Say I have indexed the following strings:
> >
> > 1. "cool gaming laptop"
> > 2. "cool gaming lappy"
> > 3. "gaming laptop cool"
> >
> > Now when I search with a query, say "cool gaming computer", I want
> > strings 1 and 2 to appear on top (where the search terms are closer to
> > each other), followed by 3.
> >
> > I can use a Term query to search, but the problem is that word
> > proximity does not come into the picture. All 3 documents get an even
> > score. The behaviour that I want is that documents that have "cool" and
> > "gaming" and "computer" (these words might or might not be present in
> > the indexed document) as close to each other as possible should get a
> > higher score.
> >
> > I can use a Phrase query so that the proximity of the search terms
> > affects scoring, but I do not get any results because the string
> > "computer" is not present in any of the indexed documents.
> >
> > Is there a way to achieve the above?
>
> I would rewrite it to this:
>
> cool gaming computer "cool gaming" "gaming computer" "cool gaming computer"
>
> Naively assuming a score of 1.0 for each hit, you would get something
> like...
>  1. "cool gaming laptop"    => 3 (cool, gaming, "cool gaming")
>  2. "cool gaming lappy"    => 3 (cool, gaming, "cool gaming")
>  3. "gaming laptop cool"    => 2 (cool, gaming)
>
> And of course if it actually finds "cool gaming computer" it would get 6.
>
> Daniel
>
>
> --
> Daniel Noll                            Forensic and eDiscovery Software
> Senior Developer                              The world's most advanced
> Nuix                                                email data analysis
> http://nuix.com/                                and eDiscovery software
>
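For what it's worth, the naive arithmetic in Daniel's rewrite can be
sketched in plain Java without touching Lucene at all. This is only an
illustration of the counting, assuming whitespace tokenisation and a flat
score of 1 per matching clause; it is not how Lucene scores documents, and
the class and method names (NaiveRewriteScore, expand, score) are made up
for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: counts how many clauses of the
// rewritten query match a document, one point per clause.
class NaiveRewriteScore {

    // Expands "a b c" into the single terms plus every contiguous phrase:
    // [a], [b], [c], [a b], [b c], [a b c].
    static List<List<String>> expand(String query) {
        List<String> words = Arrays.asList(query.trim().split("\\s+"));
        List<List<String>> clauses = new ArrayList<>();
        for (int len = 1; len <= words.size(); len++) {
            for (int start = 0; start + len <= words.size(); start++) {
                clauses.add(words.subList(start, start + len));
            }
        }
        return clauses;
    }

    // One point for each clause that occurs in the document as an exact,
    // in-order run of words (single terms are runs of length one).
    static int score(String document, String query) {
        List<String> doc = Arrays.asList(document.trim().split("\\s+"));
        int score = 0;
        for (List<String> clause : expand(query)) {
            if (Collections.indexOfSubList(doc, clause) >= 0) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        String query = "cool gaming computer";
        String[] docs = {
            "cool gaming laptop", "cool gaming lappy", "gaming laptop cool"
        };
        for (String doc : docs) {
            System.out.println("\"" + doc + "\" => " + score(doc, query));
        }
    }
}
```

Run against the three strings from the original question, this gives 3, 3
and 2, and a document that actually contains "cool gaming computer" would
get 6, which matches the figures quoted above. In a real index the
per-clause scores would of course not all be 1.0.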
