lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: TopDocs - Get all docs?
Date Mon, 08 Dec 2008 14:28:16 GMT
Is empid indexed? If it is, this should run *much* faster if you use
TermEnum/TermDocs to fetch all the empids.
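
A minimal sketch of that approach, assuming the Lucene 2.4-era API
(IndexReader.terms(Term), IndexReader.termDocs()) and assuming empid is
indexed as a single untokenized term per document. The class name
EmpIdLister and the directoryName parameter are made up for illustration.
It walks the term dictionary for the "empid" field instead of loading every
stored document, and uses TermDocs to skip terms that are left only on
deleted documents:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

// Hypothetical helper class; not part of Lucene.
public class EmpIdLister {

    public static List<String> getIndexIds(String directoryName) throws Exception {
        List<String> ids = new ArrayList<String>();
        IndexReader reader = IndexReader.open(directoryName);
        try {
            // Positions the enumeration at the first term >= ("empid", "").
            TermEnum terms = reader.terms(new Term("empid", ""));
            TermDocs termDocs = reader.termDocs();
            try {
                do {
                    Term t = terms.term();
                    // Stop at the end of the dictionary or once we leave the field.
                    if (t == null || !"empid".equals(t.field())) {
                        break;
                    }
                    // Assumption: TermDocs skips deleted documents, so next()
                    // is true only if a live document still carries this empid.
                    termDocs.seek(t);
                    if (termDocs.next()) {
                        ids.add(t.text());
                    }
                } while (terms.next());
            } finally {
                termDocs.close();
                terms.close();
            }
        } finally {
            reader.close();
        }
        return ids;
    }
}
```

The ids come back in term (lexicographic) order, with each distinct empid
listed once, which is usually what you want for a stale-id sweep.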

FWIW
Erick

On Mon, Dec 8, 2008 at 9:17 AM, Donna L Gresh <gresh@us.ibm.com> wrote:

> I have a need to get the list of all "empid"s (defined by me) in the index
> so that I can remove the ones that are "stale" by my definition; in this
> snippet I'm returning all the "empids" for later processing, but the core
> is very simple.
>
>        public Vector getIndexIds() throws Exception {
>
>                Vector vec = new Vector();
>                IndexReader ireader = IndexReader.open(directoryName);
>                try {
>                        // Doc ids run up to maxDoc() and can include deleted
>                        // slots; numDocs() counts only live documents.
>                        int maxdoc = ireader.maxDoc();
>                        for (int i=0; i<maxdoc; i++) {
>                                if (ireader.isDeleted(i)) {
>                                        continue;
>                                }
>                                Document doc = ireader.document(i);
>                                Field field = doc.getField("empid");
>                                if (field==null) {
>                                        continue;
>                                }
>                                vec.add(field.stringValue());
>                        }
>                } finally {
>                        ireader.close();
>                }
>                return vec;
>        }
>
> Donna L. Gresh
> Business Analytics and Mathematical Sciences
> IBM T.J. Watson Research Center
> (914) 945-2472
> http://www.research.ibm.com/people/g/donnagresh
> gresh@us.ibm.com
>
>
> "Ian Vink" <ianvink@gmail.com> wrote on 12/05/2008 05:57:20 PM:
>
> > Is there an easy way to get all the documents in the index?
> > Kinda like this:
> >
> > TopDocs everything = ???.GetAllDocuments();
>
