lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: Fuzzy phrase matching using SpanQuery?
Date Tue, 29 Sep 2009 14:28:23 GMT
Hi Viksit,
Why don't you try splitting the query into parts and running it as a
boosted boolean query? Build something like:
("A B C"~1000)^100 OR ("A B"~1000 OR "B C"~1000 OR "A C"~1000)^10 OR (A OR B OR C)

This is not a foolproof way to do it, though; a manual merge is the right
way.
Also, I remember a similar question being posted in the past. You might want
to have a look at the archives.
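
The boosted query above could be assembled from the phrase terms rather than
typed by hand. The following is only a sketch in plain Java that produces the
query string; the class name TieredQueryBuilder, its method names and the
adjacentOnly flag are made up for illustration and are not part of Lucene.
It assumes at least two terms and that the terms contain no characters that
need escaping for QueryParser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Lucene class): builds the three-tier query string.
public class TieredQueryBuilder {

    /** Quotes the terms as a sloppy phrase, e.g. "A B"~1000. */
    static String sloppyPhrase(List<String> terms, int slop) {
        return "\"" + String.join(" ", terms) + "\"~" + slop;
    }

    /**
     * Builds one sloppy phrase per pair of terms, keeping the original order.
     * With adjacentOnly set, only neighbouring terms are paired
     * ("A B" and "B C", but not "A C").
     */
    static List<String> pairClauses(List<String> terms, int slop, boolean adjacentOnly) {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            for (int j = i + 1; j < terms.size(); j++) {
                if (adjacentOnly && j != i + 1) {
                    continue;
                }
                clauses.add(sloppyPhrase(List.of(terms.get(i), terms.get(j)), slop));
            }
        }
        return clauses;
    }

    /** Full phrase boosted by 100, pairs boosted by 10, single terms unboosted. */
    static String build(List<String> terms, int slop, boolean adjacentOnly) {
        String all = "(" + sloppyPhrase(terms, slop) + ")^100";
        String pairs = "(" + String.join(" OR ", pairClauses(terms, slop, adjacentOnly)) + ")^10";
        String singles = "(" + String.join(" OR ", terms) + ")";
        return all + " OR " + pairs + " OR " + singles;
    }

    public static void main(String[] args) {
        // Prints the query string for the example phrase from the question.
        System.out.println(build(List.of("Squeamish", "Ossifrage", "Monster"), 1000, true));
    }
}
```

The resulting string can be handed to QueryParser as usual. Passing
adjacentOnly = true leaves out the "Squeamish Monster" style pairs. Keep in
mind that a sloppy phrase with a slop this large does not enforce term order.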

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Tue, Sep 29, 2009 at 10:56 AM, Viksit Gaur <vik.list.nutch@gmail.com> wrote:

> Hi all,
>
> I'm trying to achieve the following, and wondered if I could get feedback
> on how best to achieve it.
>
> Given an example phrase P - "Squeamish Ossifrage Monster", I'd like to
> search a corpus such that in a list of results,
>
> - Docs with all 3 words in the phrase are ranked at the top
>
> - Docs with at least 2 of the words in that order are ranked next
> -- (Say, "Ossifrage Monster" and "Squeamish Ossifrage" but not "Squeamish
> Monster")
>
> - Docs with only one of the words come next
> -- Is there a way to put these into one result set, and the first 2 kinds
> in another?
>
> The naive solution of course would be to take a phrase P, and then separate
> out all its terms (Pt_1, Pt_2, Pt_3) and then do those 3 searches manually -
> seems like a colossal waste.
>
> Is this possible with SpanQuery in some way? I also came across an article
> somewhere which said that QueryParser doesn't support SpanQuery - in which
> case, what would be the best way to actually implement this?
>
> Cheers
> Viksit
