lucene-java-user mailing list archives

From Paul Elschot
Subject Re: Lucene syntax query matched against a string content
Date Fri, 08 Feb 2008 08:52:12 GMT
Without using a RAMDirectory index, it would be necessary to
implement all Scorers used by the query directly on top of the token
stream that normally goes into the index. This is possible, but
Lucene is not designed to do this, so it won't be easy.
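
To make the idea concrete, here is a minimal sketch of matching one
boolean query directly against the tokens of a string. It does not use
Lucene at all: the class TokenStreamMatcher, its methods, and the crude
lowercase-and-split tokenizer are all hypothetical stand-ins for an
Analyzer and for real Scorers, and the query is built by hand instead
of being parsed from Lucene syntax. It only answers match/no match and
computes no relevance score.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch, not Lucene API: boolean matching on a token set.
final class TokenStreamMatcher {

    // Stand-in for an Analyzer: lowercase, then split on non-alphanumerics.
    static Set<String> tokens(String contents) {
        Set<String> result = new HashSet<>(Arrays.asList(
                contents.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")));
        result.remove(""); // split can produce a leading empty token
        return result;
    }

    // A term query matches when the token is present in the document.
    static Predicate<Set<String>> term(String t) {
        return toks -> toks.contains(t);
    }

    // Hand-built equivalent of: +(apache) +(lucene OR httpd)
    static final Predicate<Set<String>> EXAMPLE_QUERY =
            term("apache").and(term("lucene").or(term("httpd")));

    // Evaluates the query against the tokens of a single string.
    static boolean isRelevant(Predicate<Set<String>> query, String contents) {
        return query.test(tokens(contents));
    }
}
```

With the two example contents from the original question,
isRelevant(EXAMPLE_QUERY, ...) accepts the first and rejects the second,
because only the first contains "apache". Phrase queries, wildcards and
scoring would each need their own implementation, which is where most
of the work lies.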

But especially for many preparsed queries against a small set of
new documents, this might be nice to have. Still, even in that
case, it would only gain performance over using RAMDirectory
when the queries can be evaluated from the bottom up,
sharing as many subqueries as possible. And that is
just the opposite of the top-down way query search is
currently implemented on a prebuilt index.

The basic design for this would be to start from a set of queries
to be 'analyzed' to make them share as many subqueries
as possible, building a query graph.
Then this query graph would be fed the new documents
one by one, resulting in a score for each matching query
that was added to the query graph.
It is possible, but it would be quite a bit of work.
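
As a rough illustration of that design, the sketch below interns each
subquery so that identical subqueries become a single shared node, and
then evaluates all nodes bottom up once per document. Everything here
(the QueryGraph class, its term/and/or/addQuery/match methods, the
simple tokenizer) is hypothetical and not part of Lucene, and a boolean
match stands in for the per-query score described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Lucene API: many queries sharing subqueries.
final class QueryGraph {
    private enum Kind { TERM, AND, OR }

    private static final class Node {
        final Kind kind;
        final String term;  // only set for TERM nodes
        final int left;     // child node ids, -1 for TERM nodes
        final int right;
        Node(Kind kind, String term, int left, int right) {
            this.kind = kind; this.term = term; this.left = left; this.right = right;
        }
    }

    private final List<Node> nodes = new ArrayList<>();             // children precede parents
    private final Map<String, Integer> interned = new HashMap<>();  // canonical key -> node id
    private final Map<String, Integer> roots = new HashMap<>();     // query name -> root node id

    // Returns the existing node for this key, or registers a new one.
    private int intern(String key, Node node) {
        Integer id = interned.get(key);
        if (id == null) {
            id = nodes.size();
            nodes.add(node);
            interned.put(key, id);
        }
        return id;
    }

    int term(String t) { return intern("T:" + t, new Node(Kind.TERM, t, -1, -1)); }

    // Child ids are ordered in the key so that (a AND b) and (b AND a) are shared.
    int and(int a, int b) {
        return intern("A:" + Math.min(a, b) + "," + Math.max(a, b), new Node(Kind.AND, null, a, b));
    }

    int or(int a, int b) {
        return intern("O:" + Math.min(a, b) + "," + Math.max(a, b), new Node(Kind.OR, null, a, b));
    }

    void addQuery(String name, int root) { roots.put(name, root); }

    int nodeCount() { return nodes.size(); }

    // Feeds one document through the graph; each node is evaluated exactly once.
    Set<String> match(String contents) {
        Set<String> tokens = new HashSet<>(Arrays.asList(
                contents.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")));
        boolean[] value = new boolean[nodes.size()];
        for (int i = 0; i < value.length; i++) {
            Node n = nodes.get(i);
            switch (n.kind) {
                case TERM: value[i] = tokens.contains(n.term); break;
                case AND:  value[i] = value[n.left] && value[n.right]; break;
                default:   value[i] = value[n.left] || value[n.right]; break;
            }
        }
        Set<String> matched = new TreeSet<>();
        for (Map.Entry<String, Integer> e : roots.entrySet()) {
            if (value[e.getValue()]) matched.add(e.getKey());
        }
        return matched;
    }
}
```

For example, registering "+apache +(lucene OR httpd)" and
"httpd OR lucene" as two queries creates the OR node only once, so it
is evaluated once per document no matter how many queries contain it.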

And then someone will come along with the requirement
to match an existing index against such a query graph,
which is not a bad idea either, but it might need yet another
way of collecting the results.

Paul Elschot

On Friday 08 February 2008 05:48:08, Nilesh Bansal wrote:
> Hi,
> I want to create a function, which takes in a query string (in lucene
> syntax), and a string as content and returns back if the query matches
> the content or not. This would mean,
> query = +(apache) +(lucene OR httpd)
> will match
> content = HTTPD by Apache foundation is one of the most popular open
> source projects
> and will not match
> content = Lucene and httpd are projects from same open source foundation
> Basically, I need to fill in the contents of the following Java
> function. This should be easy to do, but I don't know how. I obviously
> don't want to create a dummy lucene index in memory with a single
> document and then search for the query against that (for performance
> reasons).
> public static boolean isRelevant(String luceneQuery, String contents) {
>   // TODO fill in
> }
> Instead of boolean, it could return a relevance score, which will be
> zero if the query is not relevant to the document.
> Any help will be appreciated.
> thanks
> Nilesh
