lucene-java-user mailing list archives

From Doron Cohen <cdor...@gmail.com>
Subject Re: SpanNearQuery - inOrder parameter
Date Thu, 19 May 2011 09:22:13 GMT
Hi Greg,

I created http://issues.apache.org/jira/browse/LUCENE-3120 for this problem
and attached a more general test there that exposes it, based on
your test case.

I am not yet sure that this is actually a problem in span queries that
needs fixing (see the discussion in JIRA), but I would at least like the
test case to be recorded there.

Could you provide some context on why this is a problem for you?
That is, what is preventing you from specifying in-order?

Doron

On Tue, May 17, 2011 at 12:56 PM, Gregory Tarr <Gregory.tarr@detica.com> wrote:

> Anyone else able to reply to this?
>
> Thanks
>
> Greg
>
> -----Original Message-----
> From: Gregory Tarr
> Sent: 13 May 2011 15:46
> To: 'java-user@lucene.apache.org'
> Subject: RE: SpanNearQuery - inOrder parameter
>
> Chris, and others
>
> Thanks for your reply. In effect what you are saying is that
> SpanNearQuery works as expected, and I should set inOrder=true to obtain
> the behaviour I require, even though I don't care about the order?
>
> Thanks
>
> Greg
>
> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: 11 May 2011 00:32
> To: java-user@lucene.apache.org
> Subject: RE: SpanNearQuery - inOrder parameter
>
>
>
> : I attach a junit test which shows strange behaviour of the inOrder
> : parameter on the SpanNearQuery constructor, using Lucene 2.9.4.
> :
> : My understanding of this parameter is that true forces the order and
> : false doesn't care about the order.
> :
> : Using true always works. However using false works fine when the terms
> : in the query are distinct, but if they are equivalent, e.g. searching
> : for "john john", I do not get the expected results. The workaround seems
> : to be to always use true for queries with repeated terms.
>
> I don't think the situation of "overlapping spans" has changed much
> since this thread...
>
> http://search.lucidimagination.com/search/document/ee23395e5a93c525/non_overlapping_span_queries#868b3a3ec6431afc
>
> The crux of the issue (as I recall) is that there is really no
> conceptual reason why a query for "'john' near 'john', in any order,
> with slop of Z" shouldn't match a doc that contains only one instance of
> "john" ... the first SpanTermQuery says "I found a match at position X",
> the second SpanTermQuery says "I found a match at position Y", and the
> SpanNearQuery says "the difference between X and Y is less than Z",
> therefore I have a match.  (The SpanNearQuery can't fail just because X
> and Y are the same -- they might be two distinct term instances, with
> different payloads perhaps, that just happen to have the same position.)
>
> However, the inOrder==true case works because the SpanNearQuery enforces
> that "X must be less than Y", so the same term can't ever match twice.
>
>
>
> -Hoss
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
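
To make Hoss's explanation quoted above concrete, here is a small sketch in
plain Python rather than against the Lucene API. It is a toy model, not
Lucene's implementation: the names term_positions and near_match and the
max_distance parameter are made up for illustration, and the distance check
only stands in for Lucene's real slop computation. It shows why an unordered
"john near john" query can match a document containing a single "john",
while the in-order variant cannot.

```python
def term_positions(tokens, term):
    """Return the positions at which `term` occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def near_match(tokens, first, second, max_distance, in_order):
    """Toy model of a two-clause near query (hypothetical, not Lucene code).

    Each clause reports its positions independently. A pair (x, y) matches
    when the two positions are within `max_distance` of each other and, if
    `in_order` is set, x is strictly before y.
    """
    for x in term_positions(tokens, first):
        for y in term_positions(tokens, second):
            if in_order and not x < y:
                continue  # ordered matching rejects x == y and x > y
            if abs(x - y) <= max_distance:
                return True
    return False


one_john = "john smith lives here".split()
two_johns = "john and john again".split()

# Unordered: both clauses report position 0, the distance is 0, so it matches.
print(near_match(one_john, "john", "john", 2, in_order=False))  # True
# Ordered: position 0 is not strictly before position 0, so there is no match.
print(near_match(one_john, "john", "john", 2, in_order=True))   # False
# Two real occurrences (positions 0 and 2) match in the ordered case.
print(near_match(two_johns, "john", "john", 2, in_order=True))  # True
```

Under this model, the in-order requirement is what rules out pairing a
position with itself, which is why setting inOrder=true serves as the
workaround for repeated terms discussed in this thread.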
