lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Viksit Gaur <vik.list.nu...@gmail.com>
Subject Fuzzy phrase matching using SpanQuery?
Date Tue, 29 Sep 2009 05:26:17 GMT
Hi all,

I'm trying to achieve the following, and wondered if I could get 
feedback on how best to achieve it.

Given an example phrase P - "Squeamish Ossifrage Monster", I'd like to 
search a corpus such that in a list of results,

- Docs with all 3 words in the phrase are ranked at the top

- Docs with atleast 2 of the words in that order are ranked next
-- (Say, "Ossifrage Monster" and "Squeamish Ossifrage" but not 
"Squeamish Monster")

- Docs with only one of the words come next
-- Is there a way to put these into one result set, and the first 2 
kinds in another?

The naive solution of course would be to take a phrase P, and then 
separate out all its terms (Pt_1, Pt_2, Pt_3) and then do those 3 
searches manually - seems like a colossal waste.

Is this possible with SpanQuery in some way? I also came across an 
article somewhere which said that QueryParser doesn't support SpanQuery 
- in which case, what would be the best way to actually implement this?

Cheers
Viksit

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message