From Erik Hatcher
Subject Re: Queries not derived from the text index
Date Wed, 08 Feb 2006 13:19:24 GMT

On Feb 7, 2006, at 6:17 PM, Daniel Noll wrote:
> So a user might want to enter something like this:
>     text:camel AND tag:zoo
> In this case we would want a real FieldQuery object for the  
> text:camel portion, and a non-Lucene Query instance for the  
> "tag:zoo" portion which actually queries the Tags table in the  
> database instead of the text index.
> This is a simple example; the user might want to say:
>     text:camel AND (tag:zoo OR tag:desert)
> This kind of logic could be done, as mentioned, using an OrFilter.  
> However, there is one more case which can't be done using tricks  
> with filters:
>     (text:camel AND tag:zoo) OR (text:fish AND tag:aquarium)
> If we were to make these work as an actual Query, then the user is  
> completely free to enter what they want.

One interesting option is to subclass QueryParser and override  
getFieldQuery.  When the field is "tag", return a FilteredQuery (see  
trunk codebase, or the nightly 1.9 binaries) using a Filter that  
interfaces with your database.  Caching of the filters would be  
desirable for performance reasons.
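
To illustrate the caching point, here is a minimal sketch in plain Java, with no Lucene dependency. In the 1.9-era API a Filter exposes its matches as a java.util.BitSet with one bit per document, so the sketch models only that part. The class name TagBitsCache, the tagLookup function standing in for the database query, and the assumption that the Tags table can be mapped to index document numbers are all hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a database-backed tag filter; none of these names are Lucene API.
class TagBitsCache {
    private final int maxDoc;                         // number of documents in the index
    private final Function<String, int[]> tagLookup;  // assumed database query: tag -> matching doc numbers
    private final Map<String, BitSet> cache = new HashMap<>();
    int lookups = 0;                                  // counts how often the database was consulted

    TagBitsCache(int maxDoc, Function<String, int[]> tagLookup) {
        this.maxDoc = maxDoc;
        this.tagLookup = tagLookup;
    }

    // Returns one bit per document, set when the document carries the tag.
    synchronized BitSet bits(String tag) {
        BitSet cached = cache.get(tag);
        if (cached == null) {
            lookups++;
            cached = new BitSet(maxDoc);
            for (int doc : tagLookup.apply(tag)) {
                cached.set(doc);
            }
            cache.put(tag, cached);
        }
        // Hand out a copy so callers cannot corrupt the cached bits.
        return (BitSet) cached.clone();
    }

    public static void main(String[] args) {
        Map<String, int[]> tagsTable = new HashMap<>();
        tagsTable.put("zoo", new int[] {0, 2});
        TagBitsCache filter = new TagBitsCache(4, tag -> tagsTable.getOrDefault(tag, new int[0]));
        System.out.println(filter.bits("zoo"));
        System.out.println(filter.bits("zoo"));
        System.out.println("database lookups: " + filter.lookups);
    }
}
```

A real implementation would also need to invalidate the cache when the Tags table or the index changes, which this sketch ignores.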

>> It's certainly feasible to build a custom parser (JavaCC or  
>> otherwise) that does whatever you want, but that can be quite a  
>> complex endeavor.
> Actually, we don't need to change the syntax.  Our plan would be to  
> override the getFieldQuery method on QueryParser in order to drop  
> our own Query class in on top of where a "real" field query would  
> have been.

Instead of a completely custom Query, use a FilteredQuery with a  
custom Filter.
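
The reason this handles the third example is that each tag clause carries its own filter, so it can sit anywhere in the boolean tree, whereas a single filter applied to the whole query cannot tell which tag belongs to which text clause. A rough sketch of the resulting set logic, again in plain Java with made-up document numbers (the class and method names are hypothetical, not Lucene API):

```java
import java.util.BitSet;

// Models (text:camel AND tag:zoo) OR (text:fish AND tag:aquarium) as set operations.
class ClauseCombination {
    // Builds the set of matching document numbers for one clause.
    static BitSet docs(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    // Intersection, leaving both inputs untouched.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // Union, leaving both inputs untouched.
    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet textCamel = docs(0, 1);   // from the text index
        BitSet tagZoo = docs(1, 3);      // from the database
        BitSet textFish = docs(2, 3);    // from the text index
        BitSet tagAquarium = docs(2);    // from the database

        BitSet result = or(and(textCamel, tagZoo), and(textFish, tagAquarium));
        System.out.println(result);
    }
}
```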

> Incidentally, we already override QueryParser in order to perform a  
> couple of boolean queries which Lucene for some reason doesn't like.
> For example:
>     NOT text:camel
> We change that on the fly to be like "dummy:1 NOT text:camel" where  
> "dummy" is just a field which exists on every document and always  
> contains "1"... a cheap trick, but it works.

In the latest codebase, there is a MatchAllDocsQuery that can be used
in this case, in place of the dummy field.  I have also implemented
this sort of thing with a custom query parser for a client.
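
The underlying issue is that a purely negative query has nothing to subtract from; both the dummy field and a match-all clause supply that starting set. A small plain-Java sketch of the idea (the class and method names are hypothetical, and the document numbers are invented):

```java
import java.util.BitSet;

// Models "NOT text:camel" as "all documents AND NOT text:camel".
class PureNegation {
    static BitSet not(BitSet matches, int maxDoc) {
        BitSet all = new BitSet(maxDoc);
        all.set(0, maxDoc);      // the "match all documents" starting set
        all.andNot(matches);     // remove everything the negated clause matched
        return all;
    }

    public static void main(String[] args) {
        BitSet textCamel = new BitSet();
        textCamel.set(1);
        System.out.println(not(textCamel, 4));
    }
}
```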

> Actually, if it turns out we can make Query instances that don't  
> depend on the text index, then we can fake this a little better.

I'm sure it's possible to create a Query that has nothing to do with  
the index, but I think the FilteredQuery is an easier and probably  
sufficient first attempt.
