lucene-java-user mailing list archives

From Daniel Noll <dan...@nuix.com>
Subject Re: Phrase search
Date Thu, 11 Jun 2009 00:18:03 GMT
On Fri, Jun 5, 2009 at 21:31, Abhi <abhirama.bhat@gmail.com> wrote:
> Say I have indexed the following strings:
>
> 1. "cool gaming laptop"
> 2. "cool gaming lappy"
> 3. "gaming laptop cool"
>
> Now when I search with a query, say "cool gaming computer", I want strings 1
> and 2 to appear on top (where the search terms are closer to each other),
> followed by 3.
>
> I can use a Term query to search, but the problem is that word proximity
> does not come into the picture. All 3 documents get an even score. The
> behaviour that I want is that documents that have "cool" and "gaming" and
> "computer" (these words might or might not be present in the indexed
> document) as close to each other as possible should get a higher score.
>
> I can use a Phrase query so that the proximity of the search terms affects
> scoring, but I do not get any results because the string "computer" is not
> present in any of the indexed documents.
>
> Is there a way to achieve the above?

I would rewrite it to this:

cool gaming computer "cool gaming" "gaming computer" "cool gaming computer"
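As a rough sketch of how that rewrite could be generated automatically: the snippet below takes the user's terms and emits each single term followed by every contiguous phrase of two or more terms. It is plain Java with no Lucene dependency. The class and method names (ProximityQueryExpander, expand) are made up for illustration, and it assumes the terms contain no characters that would need escaping in the query syntax and that the resulting string is parsed with clauses treated as optional (OR).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProximityQueryExpander {

    /** Returns the single terms followed by every contiguous phrase of two or more terms. */
    public static String expand(String userQuery) {
        String[] terms = userQuery.trim().split("\\s+");
        List<String> clauses = new ArrayList<>();

        // Single-term clauses, so a document containing any one term still matches.
        clauses.addAll(Arrays.asList(terms));

        // Phrase clauses, shortest first: each contiguous run of n terms, quoted.
        for (int n = 2; n <= terms.length; n++) {
            for (int start = 0; start + n <= terms.length; start++) {
                String[] run = Arrays.copyOfRange(terms, start, start + n);
                clauses.add("\"" + String.join(" ", run) + "\"");
            }
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(expand("cool gaming computer"));
    }
}
```

Note that the number of phrase clauses grows quadratically with the number of terms, so for long queries you may want to cap the phrase length.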

Naively assuming a score of 1.0 for each hit, you would get something like...
 1. "cool gaming laptop"    => 3 (cool, gaming, "cool gaming")
 2. "cool gaming lappy"     => 3 (cool, gaming, "cool gaming")
 3. "gaming laptop cool"    => 2 (cool, gaming)

And of course, if a document actually contains "cool gaming computer", it would
get 6, since all six clauses match.
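To make the arithmetic concrete, here is a small sketch of that naive counting in plain Java. It is not how Lucene scores anything (real scores depend on term frequency, norms and so on); it only adds 1 for each clause found in the document, where a phrase must appear as consecutive whole words. The class and method names (NaiveClauseScore, score) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NaiveClauseScore {

    // The six clauses of the rewritten query, with the phrase quotes stripped.
    public static final List<String> CLAUSES = Arrays.asList(
            "cool", "gaming", "computer",
            "cool gaming", "gaming computer", "cool gaming computer");

    /** Counts how many clauses occur in the document, 1 point per matching clause. */
    public static int score(String document, List<String> clauses) {
        List<String> words = Arrays.asList(document.toLowerCase().split("\\s+"));
        int score = 0;
        for (String clause : clauses) {
            List<String> phrase = Arrays.asList(clause.toLowerCase().split("\\s+"));
            // indexOfSubList matches the phrase only as a contiguous run of whole words.
            if (Collections.indexOfSubList(words, phrase) >= 0) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        for (String doc : Arrays.asList("cool gaming laptop", "cool gaming lappy",
                "gaming laptop cool", "cool gaming computer")) {
            System.out.println(doc + " => " + score(doc, CLAUSES));
        }
    }
}
```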

Daniel


-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software
