lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Approches/semantics for arbitrarily combining boolean and proximity search operators?
Date Wed, 16 May 2012 23:15:34 GMT
On Thu, May 17, 2012 at 7:11 AM, Chris Harris <ryguasu@gmail.com> wrote:
> but also crazier ones, perhaps like
>
> agreement w/5 (medical and companion)
> (dog or dragon) w/5 (cat and cow)
> (daisy and (dog or dragon)) w/25 (cat not cow)
[skip]

Everything in your post matches our experience. We ended up writing
something which transforms the query as well but had to give up on
certain crazy things people tried, such as this form:

   (A and B) w/5 (C and D)

For this one:

  A w/5 (B and C)

We found the user expected the same A to be within 5 terms of both a B
and a C, and rewrote it to match that, although the rewritten query
also matches more than they asked for. So far, there have been no
complaints about the overmatches (it's documented).
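
To make the overmatch concrete, here is a minimal sketch in plain Python
over term-position lists, not Lucene code. It assumes the rewrite was
equivalent to (A w/5 B) and (A w/5 C) with each half evaluated on its
own, and that w/5 means a position difference of at most 5; the message
does not spell out either detail, and the function and variable names
are made up for illustration.

```python
def near(positions_x, positions_y, slop):
    """True if any x position is within `slop` positions of some y position."""
    return any(abs(x - y) <= slop for x in positions_x for y in positions_y)


def overmatching_rewrite(doc, slop=5):
    """Assumed rewrite: (A w/slop B) AND (A w/slop C), checked independently.

    `doc` maps each term to the list of positions where it occurs.
    """
    a = doc.get("A", [])
    b = doc.get("B", [])
    c = doc.get("C", [])
    return near(a, b, slop) and near(a, c, slop)


# One A close to both a B and a C: the match the user intended.
intended = {"A": [10], "B": [8], "C": [13]}

# Two different As, one close to B and the other close to C.
# No single A is near both, yet the rewritten query still matches.
overmatch = {"A": [10, 100], "B": [8], "C": [103]}

print(overmatching_rewrite(intended))
print(overmatching_rewrite(overmatch))
```

Both documents match under this assumed rewrite, even though only the
first has a single A near both a B and a C.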

There is probably a fully accurate way to rewrite it, but we couldn't
figure it out at the time. Maybe start with spans for A, then remove
spans not near a B and spans not near a C, which would leave you with
only the A spans that are near both a B and a C. The problem is that if
you expand the query to something like this, it gets quite a bit more
complex, so a user query which is already complex could turn into a
really hard-to-understand mess...
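
The filtering idea can be sketched on plain position lists. This is only
an illustration in Python, not Lucene span code; the function name, the
dict-of-positions document model, and the reading of w/5 as a position
difference of at most 5 are all assumptions.

```python
def anchors_near_all(doc, anchor, others, slop=5):
    """Return positions of `anchor` that are within `slop` of every term in `others`.

    `doc` maps each term to the list of positions where it occurs.
    Start with all anchor positions, then drop any that lack a nearby
    occurrence of one of the other terms.
    """
    kept = []
    for pos in doc.get(anchor, []):
        if all(
            any(abs(pos - p) <= slop for p in doc.get(term, []))
            for term in others
        ):
            kept.append(pos)
    return kept


# One A close to both a B and a C.
same_a = {"A": [10], "B": [8], "C": [13]}

# Two different As, one close to B and the other close to C.
different_as = {"A": [10, 100], "B": [8], "C": [103]}

print(anchors_near_all(same_a, "A", ["B", "C"]))
print(anchors_near_all(different_as, "A", ["B", "C"]))
```

A document matches A w/5 (B and C) under this reading only when the
returned list is non-empty, so the second document above is rejected.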

TX
