lucene-java-user mailing list archives

From Donna L Gresh <gr...@us.ibm.com>
Subject Re: TopDocs - Get all docs?
Date Mon, 08 Dec 2008 14:17:17 GMT
I need the list of all "empid"s (a field I define) in the index so that I 
can remove the ones that are "stale" by my definition. This snippet returns 
all the empids for later processing; the core is very simple.

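        import java.util.Vector;
        import org.apache.lucene.document.Document;
        import org.apache.lucene.document.Field;
        import org.apache.lucene.index.IndexReader;

        public Vector getIndexIds() throws Exception {
                Vector vec = new Vector();
                IndexReader ireader = IndexReader.open(directoryName);
                try {
                        // Doc ids run from 0 to maxDoc()-1. numDocs() excludes deleted
                        // docs, so looping to numDocs() can miss live docs near the end.
                        int maxdoc = ireader.maxDoc();
                        for (int i=0; i<maxdoc; i++) {
                                if (ireader.isDeleted(i)) {
                                        continue;   // document(i) fails on deleted docs
                                }
                                Document doc = ireader.document(i);
                                Field field = doc.getField("empid");
                                if (field==null) {
                                        continue;   // doc has no stored empid
                                }
                                vec.add(field.stringValue());
                        }
                } finally {
                        ireader.close();
                }
                return vec;
        }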
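
For the "later processing" step, here is a minimal sketch of how the stale empids might be picked out once getIndexIds() has returned. It assumes that "stale" means "no longer present in some set of currently valid ids"; that definition, the class name StaleIds, and the method findStale are all hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class StaleIds {
    // Hypothetical helper: returns the indexed empids that are absent from
    // the set of currently valid ids, without duplicates, in first-seen order.
    // "Stale" is assumed here to mean "not in currentIds".
    static List<String> findStale(List<String> indexedIds, Set<String> currentIds) {
        Set<String> stale = new LinkedHashSet<String>();
        for (String id : indexedIds) {
            if (!currentIds.contains(id)) {
                stale.add(id);
            }
        }
        return new ArrayList<String>(stale);
    }

    public static void main(String[] args) {
        // "e2" appears twice in the index but is no longer a valid id.
        List<String> indexed = Arrays.asList("e1", "e2", "e3", "e2");
        Set<String> current = new HashSet<String>(Arrays.asList("e1", "e3"));
        System.out.println(findStale(indexed, current)); // prints [e2]
    }
}
```

Each stale id could then be removed with a delete-by-term call such as IndexWriter.deleteDocuments(new Term("empid", id)), assuming empid was indexed as a single untokenized term.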

Donna L. Gresh
Business Analytics and Mathematical Sciences 
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com


"Ian Vink" <ianvink@gmail.com> wrote on 12/05/2008 05:57:20 PM:

> Is there an easy way to get all the documents in the index?
> Kinda like this:
> 
> TopDocs everything = ???.GetAllDocuments();
