lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <>
Subject Re: Phrase query vs span query
Date Wed, 22 Feb 2006 07:11:40 GMT

: Rank 3: Documents containing atleast n (n  < N, where N is total number of
: query terms) in the same section and in order

that's a non-trivial goal in itself -- even without the "in the same
section" restriction, i can't think of a way to do that off the top of my
head other then a Span query for every Combination ... I think you may
need a custom query for that.

: However, the number of query terms is what i am more concerned about when it
: comes to implementing using Phrase query versus Span query in terms of query
: speed.

I don't know if anyone has done any benchmarks of Phrase vs Span that fit
your description.  it shoudl be fairly easy to crank some out.

: > I have no idea if it will be faster/slower then a span query, but it's a
: > little simpler because you don't need to use artificial section boundry
: > tokens.

: Wouldnt the  phrase/span query match across sections if I do not introduced
: the artifical section boundry?

if the position increment gap between sections is larger then the slop
factor you query with, then you don't need an artificial boundary token.

The subject of creating artificial tokens to indicate
chapter/section/paragraph/sentence boundaries and then using
SpanNotQueries to make sure searches don't cross one of those boundaries
-- but it doesn't sound like that's something you are worried about.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message