lucene-dev mailing list archives

From markharw00d <markharw...@yahoo.co.uk>
Subject Re: "Advanced" query language
Date Sat, 03 Dec 2005 18:00:54 GMT
Erik Hatcher wrote:

> Rest assured that human-readable query expressions aren't going away  
> at all.  I don't think Mark even implied that.


That's right. The proposal is *not* to replace what is already there - 
QueryParser will always have a useful role to play supporting the 
"Google-like" query syntax familiar to millions.
I'd just like to see another full-featured query representation for the 
reasons already outlined.

Picking up on some points raised:

Re: MoreLikeThis queries.
Yes, they can be usefully wrapped as queries (see attached simple 
example). In fact it was my attempts at bastardising QueryParser to 
support them that brought home its limitations. I ended up with a 
subclass hack that (mis)used the field name to parse a query string 
"like:123" where 123 was a doc id. With the QueryParser syntax I was not 
able to pass other parameters which MoreLikeThis could usefully use to 
control the behaviour of this query type, e.g. choice of fieldname(s) used, 
max number of terms generated, minNumberShouldTerms to match etc.
This is not unusual: each query type potentially has multiple optional 
parameters that tweak its behaviour. If I don't have a query language 
that names the parameters explicitly (say, XML) I end up having to 
define what looks like a function with a long list of positional 
parameters: "like (123,,,4,,,)". Ack.

Here's a pseudo-code example that throws together some of the more 
obscure parts of Lucene not represented in the existing QueryParser, as 
an illustration of how this could look in a more wide-reaching parser.
Imagine the user has selected an example doc #44 as something they are 
interested in, on the subject of "hockey", but they prefer to see 
documents that don't talk about ice hockey:

<BoostingQuery>
    <MatchQuery>
        <MoreLikeThisQuery percentTermsToMatch="0.25f" docId="44">
            <CompareField name="contents"/>
            <CompareField name="title"/>
        </MoreLikeThisQuery>
    </MatchQuery>
    <DowngradeQuery demoteValue="0.5">
        <SimpleQuery defaultField="contents">
            <queryText>"ice hockey" OR puck OR rink</queryText>
        </SimpleQuery>
    </DowngradeQuery>
</BoostingQuery>

BoostingQuery is a class that can use a second query to demote the 
results of a first query if it matches (see here: 
http://wiki.apache.org/jakarta-lucene/CommunityContributions).
For this and other forms of query to be able to plug into a new parser, the 
Query objects just need to adhere to bean conventions so they can be 
automatically wired in an Ant/Spring-like way using reflection.
For example, the implementation of BoostingQuery would need to have 
getter/setter properties for "matchQuery" and "downgradeQuery".
Note in this example that the existing QueryParser syntax is usefully 
used in "SimpleQuery" to avoid making the XML too verbose.
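
To make the reflection idea a little more concrete, here is a minimal Java 
sketch of how such wiring might work. It is only an illustration under 
several assumptions, not a proposed implementation: the BoostingQuery and 
SimpleQuery classes below are hypothetical stand-ins rather than the real 
Lucene classes, the registry and helper names (REGISTRY, findSetter, build, 
parse) are invented for the example, and the XML is simplified so that every 
attribute maps straight onto a setter of the enclosing bean (demoteValue 
sits on BoostingQuery and queryText is an attribute).

```java
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class BeanQueryWiring {

    /** Hypothetical stand-in for a QueryParser-backed query; not a real Lucene class. */
    public static class SimpleQuery {
        private String defaultField, queryText;
        public String getDefaultField() { return defaultField; }
        public void setDefaultField(String v) { defaultField = v; }
        public String getQueryText() { return queryText; }
        public void setQueryText(String v) { queryText = v; }
    }

    /** Hypothetical bean-style BoostingQuery; only the property names come from the post. */
    public static class BoostingQuery {
        private Object matchQuery, downgradeQuery;
        private float demoteValue;
        public Object getMatchQuery() { return matchQuery; }
        public void setMatchQuery(Object q) { matchQuery = q; }
        public Object getDowngradeQuery() { return downgradeQuery; }
        public void setDowngradeQuery(Object q) { downgradeQuery = q; }
        public float getDemoteValue() { return demoteValue; }
        public void setDemoteValue(float v) { demoteValue = v; }
    }

    /** Assumed registry of element names to bean classes (Map.of needs Java 9+). */
    static final Map<String, Class<?>> REGISTRY =
            Map.of("SimpleQuery", SimpleQuery.class, "BoostingQuery", BoostingQuery.class);

    /** Finds the public one-argument setter for a name such as "MatchQuery" or "demoteValue". */
    static Method findSetter(Class<?> cls, String name) {
        String wanted = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(wanted) && m.getParameterCount() == 1) return m;
        }
        throw new IllegalArgumentException(cls.getSimpleName() + " has no setter " + wanted);
    }

    /** Converts attribute text to the setter's parameter type (only float and String here). */
    static Object convert(String raw, Class<?> type) {
        if (type == float.class) return Float.valueOf(raw);
        return raw;
    }

    /** Instantiates the bean for an element, then wires attributes and child queries. */
    static Object build(Element el) throws Exception {
        Class<?> cls = REGISTRY.get(el.getTagName());
        if (cls == null) throw new IllegalArgumentException("Unknown query: " + el.getTagName());
        Object bean = cls.getDeclaredConstructor().newInstance();
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            Method setter = findSetter(cls, attr.getNodeName());
            setter.invoke(bean, convert(attr.getNodeValue(), setter.getParameterTypes()[0]));
        }
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                // The child names a property; its first nested element is the query to set.
                Element prop = (Element) n;
                Element value = (Element) prop.getElementsByTagName("*").item(0);
                if (value == null) throw new IllegalArgumentException("Empty <" + prop.getTagName() + ">");
                findSetter(cls, prop.getTagName()).invoke(bean, build(value));
            }
        }
        return bean;
    }

    /** Parses an XML string and returns the wired root query bean. */
    static Object parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return build(root);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build query from XML", e);
        }
    }

    public static void main(String[] args) {
        BoostingQuery q = (BoostingQuery) parse(
            "<BoostingQuery demoteValue='0.5'>"
            + "<MatchQuery><SimpleQuery defaultField='contents' queryText='hockey'/></MatchQuery>"
            + "<DowngradeQuery><SimpleQuery defaultField='contents' queryText='puck OR rink'/>"
            + "</DowngradeQuery></BoostingQuery>");
        System.out.println(((SimpleQuery) q.getDowngradeQuery()).getQueryText());
    }
}
```

The point of the sketch is that the parser itself knows nothing about any 
particular query type: adding a new one would only mean registering another 
bean class, and an element or attribute with no matching class or setter is 
rejected with an error rather than silently ignored.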

There's much detail to be added in how this would work in practice but I 
thought I'd post it here to show the general shape of one possible 
direction.






