lucenenet-user mailing list archives

From Simon Svensson <si...@devhost.se>
Subject Re: How to get all documents that fits a query?
Date Tue, 04 Dec 2012 16:42:58 GMT
Hi,

You could build a custom collector that does this by reading the domain 
ids in its Collect method. You won't hit an OutOfMemoryException as long 
as you avoid reading all hits into an array (or other storage) and 
instead process the hits "as they come".

Example using DelegatingCollector 
<https://github.com/devhost/Corelicious/blob/master/Corelicious.Lucene/DelagatingCollector.cs>:

    var collector = new DelegatingCollector((reader, id) => {
        var document = reader.Document(id);
        // Do something with your document.
    });
    searcher.Search(query, collector);
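
For readers who do not want to pull in the linked class, here is a minimal sketch of what such a collector might look like. It assumes the Lucene.Net 3.0.3 Collector base class (where AcceptsDocsOutOfOrder is a property; in 2.9.x it is a method). The name StreamingCollector is hypothetical and this is not necessarily how the linked DelegatingCollector is implemented.

```csharp
using System;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical name; a sketch, not the DelegatingCollector from the linked repository.
public sealed class StreamingCollector : Collector
{
    private readonly Action<IndexReader, int> _onHit;
    private IndexReader _reader;

    public StreamingCollector(Action<IndexReader, int> onHit)
    {
        if (onHit == null) throw new ArgumentNullException("onHit");
        _onHit = onHit;
    }

    // Scores are not needed when simply enumerating matches.
    public override void SetScorer(Scorer scorer) { }

    // Called before the hits of each index segment are collected.
    public override void SetNextReader(IndexReader reader, int docBase)
    {
        _reader = reader;
    }

    // Called once per matching document; doc is relative to the current
    // segment reader, so it is passed along together with that reader.
    public override void Collect(int doc)
    {
        _onHit(_reader, doc);
    }

    // Order does not matter here, so let Lucene pick the fastest scorer.
    public override bool AcceptsDocsOutOfOrder
    {
        get { return true; }
    }
}
```

It would be used the same way as above: pass a callback to the constructor and hand the collector to searcher.Search(query, collector). Nothing is buffered, so memory use stays flat regardless of how many documents match.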

On 2012-12-04 17:35, Omri Suissa wrote:

> Hi,
>
> I want to enumerate all the documents in the index that fit a specific
> query (let's say "cat") and perform a task in my database.
>
> When I search I need to give a collector; my usual search method uses
> TopScoreDocCollector, which takes numHits in its Create method.
>
> I don't want to limit the number of documents (if there are 10M documents I
> want to get all 10M results) so I use int.MaxValue, but then I get an out of
> memory exception.
>
> Lucene doesn't support paging so I can't ask for X documents every time
> starting from document Y (because when I want to get the 20th document I
> need to get the first one first).
>
> What can I do?
>
>
>
> Thanks,
>
> Omri
>
