lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Klaas <>
Subject Re: Generalized proximity query performance
Date Fri, 05 Oct 2007 18:23:07 GMT
On 5-Oct-07, at 10:54 AM, Chris Hostetter wrote:

> : I am using a hand rolled query of the following form (implemented  
> with
> : SpanNearQuery, not a sloppy PhraseQuery):
> : a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5
> :
> : The obvious solution, "a b c"~5, is not applicable for my issues,  
> because I
> : would like to allow for the possibility that a and b are near  
> each other in
> : one field, while c is in another field.
> Hmmm.. can you give some more concrete examples of what you mean by  
> this?
> both in terms of the use case you are trying to satisfy, and in  
> terms of
> how your current code works ... you don't have to post code or give  
> away
> trade secrets, just describe it as a black box (ie: what is the  
> input?,
> how do you know when to use fieldA vs fieldC,how do you decide when to
> make a span query vs an OR query?
> based one what youv'e described so far, it's hard to udnerstand  
> what it is
> you are doing -- which is important to udnerstand how to help you  
> make it
> better/faster.

I understand the OP to want a PhraseQuery that has an intention  
(rather than side-effect) of doing proximity-based scoring.

"phrase query here"~1000 is the current hack that performs fine for N  
< 3 query terms, but fails currently for N >= 3 since it requires  
that all the terms be present.  For larger queries, this effectively  
nullifies the usefulness of the phrase query approach.

It doesn't seem to me that writing a variant of PhraseQuery that has  
the desired functionality would be _too_ hard, but I haven't looked  
into it in depth.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message