lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: What is the fastest way to loop over all documents in an index?
Date Tue, 05 Sep 2017 02:27:47 GMT
Hi Jean,

I am also interested in answers to this question, since I need this feature too. Currently I am using
a hack:
I create an artificial field (with an artificial token) and attach it to every document.

I then traverse all documents using the code snippet given in my previous, related question (which
no one answered):

http://lucene.472066.n3.nabble.com/PostingsEnum-for-documents-that-does-not-contain-a-term-td4349482.html
I found the EverythingEnum class in Lucene50PostingsReader.java, but I couldn't figure out
how to use it.
So I do not know whether this class is meant for the task, but its name looks promising.
Thanks,
Ahmet



On Tuesday, September 5, 2017, 3:09:37 AM GMT+3, Jean Claude van Johnson <vanjohnsonjeanclaude@gmail.com>
wrote:

Hi there,

I have a use case where I need to iterate over all documents in an index from time to time.
It seems that MatchAllDocsQuery is what I should use for this; however, it creates a bunch
of objects (Score etc.) that I don't really need.

My question to you is: 

What is the fastest way to loop over all documents in an index?
Is it looping over all possible doc IDs (and filtering out deleted documents)?
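
To make the doc ID loop concrete, here is a minimal sketch of the idea in plain Java. It is a model only and does not use Lucene classes: SegmentModel and LiveDocLoop are hypothetical names, and a java.util.BitSet stands in for the live-docs bits. In Lucene itself the corresponding pieces would presumably be IndexReader.leaves(), LeafReaderContext.docBase, LeafReader.maxDoc() and LeafReader.getLiveDocs(), where getLiveDocs() returns null when a segment has no deletions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for one index segment; this is not a Lucene class. */
final class SegmentModel {
    final int docBase;      // global doc ID of the segment's first document
    final int maxDoc;       // one greater than the largest local doc ID
    final BitSet liveDocs;  // set bit = live document; null = no deletions

    SegmentModel(int docBase, int maxDoc, BitSet liveDocs) {
        this.docBase = docBase;
        this.maxDoc = maxDoc;
        this.liveDocs = liveDocs;
    }
}

final class LiveDocLoop {
    /** Visits every local doc ID per segment and keeps only the live ones. */
    static List<Integer> liveDocIds(List<SegmentModel> segments) {
        List<Integer> ids = new ArrayList<>();
        for (SegmentModel seg : segments) {
            for (int doc = 0; doc < seg.maxDoc; doc++) {
                // Skip deleted documents; a null bit set means nothing was deleted.
                if (seg.liveDocs != null && !seg.liveDocs.get(doc)) {
                    continue;
                }
                ids.add(seg.docBase + doc); // convert local ID to global ID
            }
        }
        return ids;
    }
}
```

The loop allocates nothing per document apart from the result list, which is only there so the sketch has something to return; a real traversal would process each ID in place.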

Thank you very much.

Best regards
Claude
