lucene-dev mailing list archives

From "Shai Erera" <ser...@gmail.com>
Subject Extending query parser with MinShouldMatch syntax
Date Sat, 13 Sep 2008 17:26:43 GMT
Hi,

I would like to suggest an extension to Lucene's query syntax that lets
application developers send query constraints with a MinShouldMatch value
from the client application to the search engine. Examples of such
constraints are ACLs (security information) and other filters on the
query. Today, client applications have no way to tell the back-end to treat
a filter as min-should-match (msm).

Suppose that I offer a file-type filter to the user, and the user types
some keywords, like "hello world". The user gets back results and now
wants to filter them by selecting "PDF" from the file-type filter. The
only query the client application can send to the back-end is "hello world
+filetype:pdf". But that doesn't work as expected. If queries are run with
OR as the default operator, the documents returned are those that
include filetype:pdf, and they may or may not include "hello world".
This is not what the user expected.
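
To make the problem concrete, here is a toy model of the matching rule
described above. It is not Lucene code; the class ClauseModel and the method
matches are names made up for this illustration, and it assumes "hello" and
"world" are parsed as two optional terms and filetype:pdf as one required
term.

```java
import java.util.List;
import java.util.Set;

// Toy model of boolean clause matching (hypothetical, not the Lucene API).
final class ClauseModel {

    /**
     * A document matches when it contains every required term and at
     * least minShouldMatch of the optional terms.
     */
    static boolean matches(Set<String> docTerms, List<String> required,
                           List<String> optional, int minShouldMatch) {
        if (!docTerms.containsAll(required)) {
            return false;
        }
        long hits = optional.stream().filter(docTerms::contains).count();
        return hits >= minShouldMatch;
    }

    public static void main(String[] args) {
        // A PDF that mentions neither "hello" nor "world".
        Set<String> pdfOnly = Set.of("filetype:pdf", "invoice");
        List<String> required = List.of("filetype:pdf");
        List<String> optional = List.of("hello", "world");

        // With no msm, the required filter alone is enough to match.
        System.out.println(matches(pdfOnly, required, optional, 0)); // true
        // With msm = 1, at least one keyword must also be present.
        System.out.println(matches(pdfOnly, required, optional, 1)); // false
    }
}
```

With msm left at 0 the PDF that contains none of the keywords still matches,
which is the surprising result; with msm set to 1 it is rejected.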

The only option today is for the application to parse the query, recognize
that it contains an msm filter (though how it would do so is not obvious, and
not easily extended to other filters) and set an msm on the resulting
query.

Instead, we could offer the following syntax:
- term# - defaults to msm '1'.
- term#<value> - sets msm to the specified value.
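
As a rough sketch of how a parser might recognize the two forms above, the
following reads the suffix off a single whitespace-delimited token. The class
MsmSuffix and its methods are hypothetical, the value is assumed to be a
non-negative integer, and how the extracted msm would then be applied to the
query is left open.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the proposed "term#" / "term#<value>" suffix.
final class MsmSuffix {

    // A term, then '#', then an optional run of digits at the end.
    private static final Pattern SUFFIX = Pattern.compile("^(.+)#(\\d*)$");

    /** Returns the msm carried by the token, or -1 if it has no '#' suffix. */
    static int msmOf(String token) {
        Matcher m = SUFFIX.matcher(token);
        if (!m.matches()) {
            return -1;
        }
        // A bare '#' means the default msm of 1.
        return m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
    }

    /** Returns the token with any msm suffix stripped. */
    static String termOf(String token) {
        Matcher m = SUFFIX.matcher(token);
        return m.matches() ? m.group(1) : token;
    }
}
```

For example, msmOf("filetype:pdf#") gives 1, msmOf("filetype:pdf#2") gives 2,
and a plain token such as "hello" gives -1 and is returned unchanged by
termOf.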

What do you think?

Shai
