lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: What is the fastest way to loop over all documents in an index?
Date Tue, 05 Sep 2017 20:54:46 GMT
You can call MultiFields.getLiveDocs(IndexReader) to get the bitset
identifying which documents are not deleted.
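
A minimal sketch of the loop this suggests, under stated assumptions: `LiveBits` is a hypothetical stand-in for the `get(int)` and `length()` methods of Lucene's `Bits`, and `liveDocIds` and `fromBitSet` are illustrative names, not Lucene APIs. As far as I know, in the Lucene versions current at the time of this thread, `MultiFields.getLiveDocs` returns null when the reader has no deletions, so the loop treats null as "every document is live".

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for the two methods of Lucene's Bits used here. */
interface LiveBits {
    boolean get(int index);
    int length();
}

final class LiveDocLoop {
    /**
     * Returns every doc ID in [0, maxDoc) that is not deleted.
     * A null liveDocs is treated as "no deletions", so every ID is live.
     */
    static List<Integer> liveDocIds(LiveBits liveDocs, int maxDoc) {
        List<Integer> ids = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            if (liveDocs == null || liveDocs.get(docId)) {
                ids.add(docId);
            }
        }
        return ids;
    }

    /** Wraps a java.util.BitSet so the sketch can run without an index. */
    static LiveBits fromBitSet(BitSet bits, int length) {
        return new LiveBits() {
            @Override public boolean get(int index) { return bits.get(index); }
            @Override public int length() { return length; }
        };
    }
}
```

Against a real index, the arguments would presumably be `MultiFields.getLiveDocs(reader)` and `reader.maxDoc()`, and the loop body would process each live document directly instead of collecting IDs into a list.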

Mike McCandless

http://blog.mikemccandless.com

On Tue, Sep 5, 2017 at 2:54 PM, Mikhail Khludnev <mkhl@apache.org> wrote:

> You can call searcher.search() with a MatchAllDocsQuery, passing your own
> collector implementation, which will be notified about every non-deleted doc
> via collect(docId).
>
> On Tue, Sep 5, 2017 at 3:09 AM, Jean Claude van Johnson <
> vanjohnsonjeanclaude@gmail.com> wrote:
>
> > Hi there,
> >
> > I have a use case where I need to iterate over all documents in an index
> > from time to time.
> > It seems that the MatchAllDocsQuery is what I should use for this; however,
> > it creates a bunch of objects (Score etc.) that I don’t really need.
> >
> > My question to you is:
> >
> > What is the fastest way to loop over all documents in an index?
> > Is it looping over all possible doc IDs (and filtering out deleted
> > documents)?
> >
> > Thank you very much.
> >
> > Best regards
> > Claude
> >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
