lucene-java-user mailing list archives

From Jean Claude van Johnson <vanjohnsonjeancla...@gmail.com>
Subject Re: What is the fastest way to loop over all documents in an index?
Date Thu, 07 Sep 2017 08:47:40 GMT
Many thanks for your answers!

Cheers
Claude

> On 5 Sep 2017, at 21:54, Michael McCandless <lucene@mikemccandless.com> wrote:
> 
> You can call MultiFields.getLiveDocs(IndexReader) to get the bitset
> identifying which documents are not deleted.
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> On Tue, Sep 5, 2017 at 2:54 PM, Mikhail Khludnev <mkhl@apache.org> wrote:
> 
>> You can call searcher.search() with a MatchAllDocsQuery, passing your own
>> Collector implementation, which will be notified about every non-deleted
>> doc via collect(docId).
>> 
>> On Tue, Sep 5, 2017 at 3:09 AM, Jean Claude van Johnson <
>> vanjohnsonjeanclaude@gmail.com> wrote:
>> 
>>> Hi there,
>>> 
>>> I have a use case where I need to iterate over all documents in an index
>>> from time to time.
>>> It seems that MatchAllDocsQuery is what I should use for this; however,
>>> it creates a bunch of objects (Score etc.) that I don’t really need.
>>> 
>>> My question to you is:
>>> 
>>> What is the fastest way to loop over all documents in an index?
>>> Is it looping over all possible doc IDs (and filtering out deleted
>>> documents)?
>>> 
>>> Thank you very much.
>>> 
>>> Best regards
>>> Claude
>>> 
>>> 
>> 
>> 
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> 
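
The live-docs approach suggested above (walk every doc ID and skip the deleted ones) can be sketched roughly as follows. This is a minimal stand-in, not Lucene code: it uses java.util.BitSet in place of the Bits object returned by MultiFields.getLiveDocs(IndexReader), and a plain int in place of reader.maxDoc(). The class name LiveDocsSketch and the method liveDocIds are hypothetical names chosen for illustration. The null check reflects the assumption that a missing live-docs bitset means the index has no deletions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class LiveDocsSketch {

    /**
     * Collects the IDs of all non-deleted documents.
     *
     * maxDoc   stand-in for reader.maxDoc(): one greater than the largest doc ID
     * liveDocs stand-in for the live-docs bitset; a set bit means "not deleted",
     *          and null is treated as "no deletions" (assumption, see above)
     */
    public static List<Integer> liveDocIds(int maxDoc, BitSet liveDocs) {
        List<Integer> ids = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            // Skip documents whose bit is cleared, i.e. deleted documents.
            if (liveDocs == null || liveDocs.get(docId)) {
                ids.add(docId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Five documents, with doc 2 marked as deleted.
        BitSet live = new BitSet(5);
        live.set(0, 5);
        live.clear(2);

        System.out.println(liveDocIds(5, live)); // prints [0, 1, 3, 4]
        System.out.println(liveDocIds(3, null)); // prints [0, 1, 2]
    }
}
```

In a real index the loop body would load or inspect each live document instead of collecting IDs into a list; no scoring objects are created along the way, which is the concern raised in the original question.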



