lucene-java-user mailing list archives

From "Karl Penney" <karl.pen...@eastech.ca>
Subject How to get list of unique field values for a subset of documents
Date Fri, 12 Dec 2003 13:45:53 GMT
I'm looking for a fast way (in terms of execution time) to get a list of unique values for a field called
"partno" across all documents which have a given value for a field called "type".  This is for
populating a drop-down list.

What I have done so far is to build a list of document numbers for each value of the "type"
field (using TermEnum and TermDocs).  I then use TermEnum to get a list of values for the
"partno" field.  This returns values for all of the documents, so for each "partno" value
I use TermDocs to get a list of document numbers for the "partno" and then cross check this
list against the document list for "type".  If at least one of the "partno" documents is in
the "type" list then I know that the "partno" value is valid for that "type" value.

This works ok, but is not as fast as I would like it to be.  Any ideas on how to improve it?

Here is some sample code:

// build list of document numbers for each value of the type field
TermEnum te = reader.terms(new Term("type", ""));
typeMap = new HashMap();
while (te.term() != null && "type".equals(te.term().field())) {
  String type = te.term().text();
  TermDocs docs = reader.termDocs(new Term("type", type));
  ArrayList typelist = new ArrayList();
  while (docs.next()) {
    String docnum = String.valueOf(docs.doc());
    typelist.add(docnum);
  }
  typeMap.put(type, typelist);
  te.next(); 
}

// add partno values to dropdown for type-1
ArrayList doclist = (ArrayList)typeMap.get("type-1");
if (doclist != null) {
    // get unique values for the partno field from the index
    // (separate variable name so it does not clash with the "te" used for the type field)
    TermEnum partnoTerms = reader.terms(new Term("partno", ""));
    while (partnoTerms.term() != null && "partno".equals(partnoTerms.term().field())) {
        String value = partnoTerms.term().text();
        // check that at least one document with the "partno" value is in the list for type-1
        TermDocs docs = reader.termDocs(new Term("partno", value));
        while (docs.next()) {
            String docnum = String.valueOf(docs.doc());
            if (doclist.contains(docnum)) {
                // add value to drop down
                cbo.add(value.toUpperCase());
                break;
            }
        }
        partnoTerms.next();
    }
}
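
For illustration only, the cross-check described above can be sketched in isolation from Lucene. The sketch below is not the code from the index application: it assumes the document numbers for one "type" value are held in a java.util.BitSet rather than an ArrayList of strings, and it uses a plain Map of int arrays as a stand-in for the postings that TermEnum and TermDocs would return. The class name, method names and sample data are all hypothetical, and it uses generics, so it needs a newer JDK than the sample code above.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartnoCrossCheck {

    /** Turns a list of document numbers into a BitSet for constant-time membership tests. */
    static BitSet toBitSet(int[] docNums) {
        BitSet bits = new BitSet();
        for (int doc : docNums) {
            bits.set(doc);
        }
        return bits;
    }

    /**
     * Returns the partno values that occur in at least one document of typeDocs.
     * partnoPostings maps each partno value to its document numbers; it is a
     * stand-in (assumption) for what TermEnum/TermDocs would supply from the index.
     */
    static List<String> valuesForType(BitSet typeDocs, Map<String, int[]> partnoPostings) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : partnoPostings.entrySet()) {
            for (int doc : entry.getValue()) {
                if (typeDocs.get(doc)) {
                    values.add(entry.getKey().toUpperCase());
                    break; // one matching document is enough
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical data: documents 0, 2 and 5 have type "type-1".
        BitSet typeOneDocs = toBitSet(new int[] {0, 2, 5});

        Map<String, int[]> partnoPostings = new LinkedHashMap<>();
        partnoPostings.put("ab-100", new int[] {1, 2});
        partnoPostings.put("cd-200", new int[] {3, 4});
        partnoPostings.put("ef-300", new int[] {5});

        System.out.println(valuesForType(typeOneDocs, partnoPostings)); // prints [AB-100, EF-300]
    }
}
```

The only difference from the sample code is the membership test: typeDocs.get(doc) does not scan a list or convert the document number to a String. Whether that makes a noticeable difference on a real index has not been measured here.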


