lucene-dev mailing list archives

From "Goddard, Michael J." <MICHAEL.J.GODD...@saic.com>
Subject RE: Request for clarification on unordered SpanNearQuery
Date Thu, 04 Mar 2010 20:03:09 GMT
Paul (and Mark),

Thank you for answering.  Do you suppose "not really straightforward" means "40 hours" or
something like that?  I'm just trying to get an idea of whether what I'm attempting is worth
the effort.

  Mike


-----Original Message-----
From: java-dev-return-47351-MICHAEL.J.GODDARD=saic.com@lucene.apache.org on behalf of Paul Elschot
Sent: Thu 3/4/2010 11:51 AM
To: java-dev@lucene.apache.org
Subject: Re: Request for clarification on unordered SpanNearQuery
 
Michael,

The test for the fourth range fails because the first matching subspans
(for t1 in this case) is always the one that is advanced first. The first
match at that point has less slop (0) than the maximum allowed (1),
so one could instead try advancing another subspans first.
But that is not really straightforward to implement, especially when different
terms can be indexed at the same position.

Perhaps the javadocs should be improved to mention that in the unordered
case the first subspans is always the one that is advanced first.
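
To make the behaviour described above concrete, here is a minimal sketch of
the "always advance the first subspans" strategy. It is a simplified model
and not the actual Lucene implementation: the class name UnorderedNearSketch
and the method matches are hypothetical, every term is assumed to occupy
exactly one position, and a single document is represented as one sorted
position list per term.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of unordered near matching in one document.
// Assumption: each term occurrence has length 1, so end = position + 1.
class UnorderedNearSketch {

    // positions[t] holds the sorted positions of term t.
    // Returns matches as {start, end} pairs, end exclusive.
    static List<int[]> matches(int[][] positions, int slop) {
        int n = positions.length;
        int[] idx = new int[n]; // current occurrence of each term
        List<int[]> out = new ArrayList<>();
        while (true) {
            int minTerm = 0;
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int t = 0; t < n; t++) {
                if (idx[t] >= positions[t].length) {
                    return out; // one term is exhausted: no more matches
                }
                int p = positions[t][idx[t]];
                if (p < min) { min = p; minTerm = t; }
                if (p > max) { max = p; }
            }
            int start = min;
            int end = max + 1;
            // Slop is the span width minus the total length of the terms.
            if (end - start - n <= slop) {
                out.add(new int[] { start, end });
            }
            // Always advance the term with the smallest start, even when
            // advancing another term could still yield a match within slop.
            idx[minTerm]++;
        }
    }

    public static void main(String[] args) {
        // Doc 11 contains "t1 t2 t1 t3 t2 t3"
        int[][] doc11 = { { 0, 2 }, { 1, 4 }, { 3, 5 } }; // t1, t2, t3
        for (int[] m : matches(doc11, 1)) {
            System.out.println(m[0] + " - " + m[1]);
        }
    }
}
```

For doc 11 with slop 1 this sketch reports the ranges 0-4, 1-4 and 2-5 and
then stops: after the 2-5 match the smallest start belongs to t1, which has
no further occurrences. The range 2-6 would only be reached by advancing t3
instead.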

Regards,
Paul Elschot

On Thursday 04 March 2010 17:34:26, Goddard, Michael J. wrote:
> I've been working on some highlighting changes involving Spans (https://issues.apache.org/jira/browse/LUCENE-2287)
> and could use some help understanding when overlapping Spans are valid.  To illustrate, I
> added the test below to the TestSpans class; this test fails because there is no fourth range.
> 
> Am I wrong in my expectation that that last range would match?
> 
> Thanks.
> 
>   Mike
> 
> 
>   // Doc 11 contains "t1 t2 t1 t3 t2 t3"
>   public void testSpanNearUnOrderedOverlap() throws Exception {
>     boolean ordered = false;
>     int slop = 1;
>     SpanNearQuery snq = new SpanNearQuery(
>                               new SpanQuery[] {
>                                 makeSpanTermQuery("t1"),
>                                 makeSpanTermQuery("t2"),
>                                 makeSpanTermQuery("t3") },
>                               slop,
>                               ordered);
>     Spans spans = snq.getSpans(searcher.getIndexReader());
>     
>     assertTrue("first range", spans.next());
>     assertEquals("first doc", 11, spans.doc());
>     assertEquals("first start", 0, spans.start());
>     assertEquals("first end", 4, spans.end());
>     
>     assertTrue("second range", spans.next());
>     assertEquals("second doc", 11, spans.doc());
>     assertEquals("second start", 1, spans.start());
>     assertEquals("second end", 4, spans.end());
>     
>     assertTrue("third range", spans.next());
>     assertEquals("third doc", 11, spans.doc());
>     assertEquals("third start", 2, spans.start());
>     assertEquals("third end", 5, spans.end());
>     
>     // Question: why wouldn't this Span be found?
>     assertTrue("fourth range", spans.next());
>     assertEquals("fourth doc", 11, spans.doc());
>     assertEquals("fourth start", 2, spans.start());
>     assertEquals("fourth end", 6, spans.end());
>     
>     assertFalse("fifth range", spans.next());
>   }
> 
> 
