lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Proximity Query Parser
Date Fri, 01 Sep 2006 18:45:13 GMT
On Friday 01 September 2006 19:46, Mark Miller wrote:
> Eric also gave me the idea of using a SpanNear with maximum slop as a
> boolean to connect spans. Using this and SpanOr seems to make my time spent
> on the distribution of proximity clauses a little foolish :) Is that true?

There is practice and there is theory. You chose practice this time.
(In theory there is no difference between the two, but in practice...)

> Is there any disadvantage to the max slop SpanNear, SpanOr solution? Any
> advantage to distributing the 'and's?

Span queries (and phrase queries) access the proximity information,
and that slows them down compared to pure boolean queries,
which can get away with using only the term frequencies in the
documents. The difference in access time is roughly proportional to
these term frequencies, because every occurrence of a term has a
position that must be read.
When querying an index with larger documents, the difference can be
quite noticeable. However, using proximity information normally
gives more accurate results. With operators in the query language,
the choice is up to the user.
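
To make the cost difference concrete, here is a small sketch in plain Java.
It is a toy model, not Lucene code: the class name, the map-based "index"
for a single document, and the simple position-distance test are all
assumptions made for illustration. The boolean check only needs to know
that each term occurs, while the proximity check has to walk the position
lists, whose lengths are the term frequencies.

```java
import java.util.HashMap;
import java.util.Map;

class ProximityCostSketch {
    // Toy posting data for ONE document: term -> sorted token positions.
    // The term frequency is the length of the array.

    /** Boolean AND: only asks whether each term occurs at all. */
    static boolean matchesAnd(Map<String, int[]> doc, String a, String b) {
        return doc.containsKey(a) && doc.containsKey(b);
    }

    /**
     * Unordered proximity: true if some occurrence of a and some occurrence
     * of b are at most maxDistance positions apart. The merge loop may visit
     * every position of both terms, so its cost grows with the frequencies.
     */
    static boolean matchesNear(Map<String, int[]> doc, String a, String b, int maxDistance) {
        int[] pa = doc.get(a);
        int[] pb = doc.get(b);
        if (pa == null || pb == null) {
            return false;
        }
        int i = 0, j = 0;
        while (i < pa.length && j < pb.length) {
            if (Math.abs(pa[i] - pb[j]) <= maxDistance) {
                return true;
            }
            // Advance whichever list is behind; this can only shrink the gap.
            if (pa[i] < pb[j]) {
                i++;
            } else {
                j++;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, int[]> doc = new HashMap<>();
        doc.put("span", new int[] {0, 40, 90});
        doc.put("query", new int[] {20, 65});
        System.out.println(matchesAnd(doc, "span", "query"));      // both terms occur
        System.out.println(matchesNear(doc, "span", "query", 5));  // never that close
        System.out.println(matchesNear(doc, "span", "query", 20)); // positions 0 and 20
    }
}
```

In this document both terms occur, so the boolean query matches, but they
never come within 5 positions of each other, so the stricter proximity
query rejects it. That is the accuracy gain paid for by reading positions.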

Similarly, phrase queries are faster than span queries, but phrase queries
cannot be nested. Ideally, a query language would hide this, but
this requires an implementation in which phrase queries treat slop
in the same way as span queries.
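
The nesting point can be sketched the same way. Again this is a toy model
in plain Java, not Lucene's implementation: the Span class, the helper
names term/or/near, and the gap-based slop are assumptions for
illustration (in Lucene the corresponding classes are SpanTermQuery,
SpanOrQuery and SpanNearQuery). The key property is that near() returns
spans rather than a yes/no answer, so its result can be fed into another
near(), and a near() with maximum slop behaves like a boolean AND.

```java
import java.util.ArrayList;
import java.util.List;

class SpanNestingSketch {
    /** A matched region of one document: token positions [start, end). */
    static final class Span {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    /** Spans for a single term: one span of length 1 per occurrence. */
    static List<Span> term(int... positions) {
        List<Span> out = new ArrayList<>();
        for (int p : positions) {
            out.add(new Span(p, p + 1));
        }
        return out;
    }

    /** OR: all spans of either operand. */
    static List<Span> or(List<Span> a, List<Span> b) {
        List<Span> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    /**
     * Unordered NEAR: for every pair of non-overlapping spans separated by
     * at most slop tokens, emit one span covering both. Brute force over
     * all pairs, for clarity only.
     */
    static List<Span> near(List<Span> a, List<Span> b, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span x : a) {
            for (Span y : b) {
                int gap = Math.max(x.start, y.start) - Math.min(x.end, y.end);
                if (gap >= 0 && gap <= slop) {
                    out.add(new Span(Math.min(x.start, y.start), Math.max(x.end, y.end)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical positions: apple at 0 and 10, pie at 1, tart at 30, recipe at 50.
        List<Span> inner = near(term(0, 10), or(term(1), term(30)), 0);
        // Maximum slop: the outer NEAR only requires both operands to match.
        List<Span> asAnd = near(inner, term(50), Integer.MAX_VALUE);
        List<Span> tight = near(inner, term(50), 10);
        System.out.println(inner.size() + " " + asAnd.size() + " " + tight.size());
    }
}
```

A phrase query in this model would return only true or false, which is
why it cannot serve as an operand of another proximity operator.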
 
Regards,
Paul Elschot


