lucene-dev mailing list archives

From "Robert Engels" <reng...@ix.netcom.com>
Subject non indexed field searching?
Date Tue, 16 May 2006 19:37:33 GMT
I know I (and others) have brought this up before, but maybe now, with
lazy field loading (seemingly motivated by larger documents being stored),
it is time to revisit it.

It seems that maybe a query could be separated into Filter and Query clauses
(similar to how the query optimizer works in Nutch). Clauses that were based
on non-indexed fields would be converted to a Filter.

The problem is that something like

(indexed:somevalue OR nonindexed:somevalue)

would require a complete visit to every document.

But something like

(indexed:somevalue AND nonindexed:somevalue)

would be very efficient, since only the documents matching the indexed
clause would need to be visited.
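To make the cost difference concrete, here is a minimal sketch of the two cases. It does not use Lucene at all: ToyIndex, its postings map, and the documentLoads counter are hypothetical stand-ins invented for illustration, and the non-indexed clause is modelled as a plain predicate on a stored value.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for an index; this is NOT Lucene's API.
final class ToyIndex {
    // Stored (non-indexed) field values, by document number.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Inverted index: "field:value" -> documents containing that term.
    private final Map<String, BitSet> postings = new HashMap<>();
    // Counts how many stored documents were loaded, to compare costs.
    int documentLoads = 0;

    int add(Map<String, String> indexedFields, Map<String, String> storedFields) {
        int doc = stored.size();
        stored.add(storedFields);
        for (Map.Entry<String, String> e : indexedFields.entrySet()) {
            String term = e.getKey() + ":" + e.getValue();
            postings.computeIfAbsent(term, k -> new BitSet()).set(doc);
        }
        return doc;
    }

    int maxDoc() {
        return stored.size();
    }

    // Documents matching an indexed term; no stored document is loaded.
    BitSet term(String field, String value) {
        BitSet hits = postings.get(field + ":" + value);
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }

    private String storedValue(int doc, String field) {
        documentLoads++;
        return stored.get(doc).get(field);
    }

    // indexed AND nonindexed: only hits of the indexed clause are loaded.
    BitSet and(BitSet indexedHits, String field, Predicate<String> test) {
        BitSet result = new BitSet();
        for (int doc = indexedHits.nextSetBit(0); doc >= 0;
                doc = indexedHits.nextSetBit(doc + 1)) {
            String val = storedValue(doc, field);
            if (val != null && test.test(val)) {
                result.set(doc);
            }
        }
        return result;
    }

    // indexed OR nonindexed: every document the indexed clause did not
    // already match has to be loaded and tested.
    BitSet or(BitSet indexedHits, String field, Predicate<String> test) {
        BitSet result = (BitSet) indexedHits.clone();
        for (int doc = 0; doc < maxDoc(); doc++) {
            if (result.get(doc)) {
                continue;
            }
            String val = storedValue(doc, field);
            if (val != null && test.test(val)) {
                result.set(doc);
            }
        }
        return result;
    }
}

final class FilterSplitDemo {
    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.add(Collections.singletonMap("indexed", "somevalue"),
                Collections.singletonMap("nonindexed", "somevalue"));
        idx.add(Collections.singletonMap("indexed", "other"),
                Collections.singletonMap("nonindexed", "somevalue"));
        BitSet hits = idx.and(idx.term("indexed", "somevalue"),
                "nonindexed", "somevalue"::equals);
        System.out.println(hits + " after " + idx.documentLoads + " load(s)");
    }
}
```

In the AND case the number of stored documents loaded is bounded by the number of hits on the indexed clause; in the OR case it grows with the size of the index.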

I understand that this is moving Lucene closer to a database, but it is just
very difficult to perform some complex queries efficiently without it.

*** As an aside, I still don't understand why Filter is not an interface

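interface Filter {
    boolean include(IndexReader reader, int doc) throws IOException;
}

and then you would have

class NonIndexedFilter implements Filter {
    private final String fieldname;
    private final String expression;

    NonIndexedFilter(String fieldname, String expression) {
        this.fieldname = fieldname;
        this.expression = expression;
    }

    public boolean include(IndexReader reader, int doc) throws IOException {
        Document d = reader.document(doc);  // loads the stored fields
        String val = d.get(fieldname);      // null if the field is not stored
        return {evaluate expression against val};
    }
}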

Filter being an interface should incur very little overhead in the common
case where it is backed by a BitSet, as a modern JVM will inline the call.
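As a rough illustration of that common case, the sketch below puts a BitSet-backed filter and a stored-value filter behind the same per-document interface. All names here (DocFilter, BitSetDocFilter, StoredValueDocFilter, DocFilters) are hypothetical, the IndexReader argument is dropped so the example runs without Lucene, and an IntFunction stands in for loading a stored field. Whether the JVM actually inlines the call depends on the runtime; the code only shows that the BitSet case reduces to a single bit test.

```java
import java.util.BitSet;
import java.util.function.IntFunction;
import java.util.function.Predicate;

// Hypothetical per-document filter, in the spirit of the proposal above.
interface DocFilter {
    boolean include(int doc);
}

// Common case: membership is precomputed, so include() is one bit test.
final class BitSetDocFilter implements DocFilter {
    private final BitSet bits;

    BitSetDocFilter(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public boolean include(int doc) {
        return bits.get(doc);
    }
}

// Non-indexed case: the value is fetched per document and then tested.
final class StoredValueDocFilter implements DocFilter {
    private final IntFunction<String> loader; // stand-in for a stored-field read
    private final Predicate<String> test;

    StoredValueDocFilter(IntFunction<String> loader, Predicate<String> test) {
        this.loader = loader;
        this.test = test;
    }

    @Override
    public boolean include(int doc) {
        String val = loader.apply(doc);
        return val != null && test.test(val);
    }
}

final class DocFilters {
    // Both kinds of filter are applied through the same call site.
    static BitSet apply(BitSet candidates, DocFilter filter) {
        BitSet result = new BitSet();
        for (int doc = candidates.nextSetBit(0); doc >= 0;
                doc = candidates.nextSetBit(doc + 1)) {
            if (filter.include(doc)) {
                result.set(doc);
            }
        }
        return result;
    }
}
```

The caller iterating over candidate documents does not need to know which kind of filter it was handed.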
