lucene-java-user mailing list archives

From Morus Walter <morus.wal...@tanto-xipolis.de>
Subject Re: Searching for null value?
Date Fri, 11 Apr 2003 13:27:28 GMT
petite_abeille writes:
> 
> I see. What about the other way around? Is there a way to express a 
> query for the existence of a field, no matter what its value might be? 
> Something like: search for all documents with a field named "category"?
> 
I have a small class that lists all values for a given field.
It should be extensible to listing all documents; however, you would get
multiple references for documents that have more than one token in category.

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.WildcardTermEnum;

class ListTerms {
    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: java ListTerms <index> <field> <wildcard expression>");
            System.exit(1);
        }
        try {
            IndexReader reader = IndexReader.open(args[0]);
            try {
                // The pattern is matched against the terms of the given field only.
                Term pattern = new Term(args[1], args[2]);
                // Named termEnum because "enum" is a reserved word since Java 5.
                WildcardTermEnum termEnum = new WildcardTermEnum(reader, pattern);
                // The enumeration starts on the first match; term() is null if there is none.
                if (termEnum.term() != null) {
                    do {
                        // docFreq() is the number of documents containing the term
                        System.out.println(termEnum.docFreq() + ": " + termEnum.term().text());
                    } while (termEnum.next());
                }
                termEnum.close();
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            System.err.println(" caught a " + e.getClass() +
                               "\n with message: " + e.getMessage());
        }
    }
}

The arguments are the index, the fieldname and a wildcard expression for the 
field value, e.g. "*" for anything.

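Inside that loop you need a second loop over the documents containing the
current term (TermDocs is in org.apache.lucene.index), which should look
something like

	// termEnum: the WildcardTermEnum from ListTerms, positioned on the current term
	TermDocs termDocs = reader.termDocs();
	termDocs.seek(termEnum.term());
	try {
	    while (termDocs.next()) {
	        // termDocs.doc() is the number of a document containing the term
	        System.out.println(termDocs.doc());
	    }
	} finally {
	    termDocs.close();
	}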

If you expect duplicate entries, you can use a BitSet to flag each document
as you encounter it and output the flagged documents afterwards.
BTW: this also answers your original question. If you instead output the
documents that are not flagged, you get the ones without this field.
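
The BitSet bookkeeping could look roughly like the sketch below. It is only
an illustration and leaves Lucene out so that it runs on its own: the Map of
term to document numbers stands in for what the TermDocs loop would deliver,
and the names FieldPresence, docsWithField, docsWithoutField and maxDoc are
made up for this example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldPresence {
    /** Flags every document number that occurs in at least one term's document list. */
    static BitSet docsWithField(Map<String, int[]> postings, int maxDoc) {
        BitSet seen = new BitSet(maxDoc);
        for (int[] docs : postings.values()) {
            for (int doc : docs) {
                // Setting a bit twice is harmless, so a document holding
                // several terms of the field is still recorded only once.
                seen.set(doc);
            }
        }
        return seen;
    }

    /** Lists the document numbers below maxDoc that were never flagged. */
    static List<Integer> docsWithoutField(BitSet seen, int maxDoc) {
        List<Integer> missing = new ArrayList<>();
        for (int doc = seen.nextClearBit(0); doc < maxDoc; doc = seen.nextClearBit(doc + 1)) {
            missing.add(doc);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in data: term text -> numbers of documents containing it.
        Map<String, int[]> postings = new LinkedHashMap<>();
        postings.put("books", new int[] {0, 2});
        postings.put("music", new int[] {2, 3}); // document 2 has two tokens in the field
        int maxDoc = 6;                          // assumed total number of documents

        BitSet seen = docsWithField(postings, maxDoc);
        System.out.println("with field:    " + seen);
        System.out.println("without field: " + docsWithoutField(seen, maxDoc));
    }
}
```

Against a real index you would call seen.set(termDocs.doc()) inside the
TermDocs loop and take the upper bound from reader.maxDoc(). As far as I
know, the unflagged numbers may then also include deleted documents, so you
probably want to skip those with reader.isDeleted(doc).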

Of course this is not a "query" in the usual sense.

HTH
	Morus
