lucenenet-user mailing list archives

From Omri Suissa <omri.sui...@diffdoof.com>
Subject Re: How to get all documents that fits a query?
Date Tue, 04 Dec 2012 18:32:11 GMT
Hi,
Thanks!

Omri Suissa, VP R&D
DiffDoof Ltd.
11, Galgaley Haplada Street, P.O.Box 2150
Herzlia Pituach 46120, Israel
Tel:  +972 9 7724228
Cell: +972 54 5395206
Fax:  +972 9 9512577
www.DiffDoof.com <http://www.DiffDoof.com>

On Tue, Dec 4, 2012 at 6:42 PM, Simon Svensson <sisve@devhost.se> wrote:

>  Hi,
>
> You could build a custom collector that does this by reading domain ids in
> the Collect method. You won't hit an OutOfMemoryException if you avoid
> reading all hits into an array (or other storage) and instead process the
> hits "as they come".
>
> Example using DelegatingCollector<https://github.com/devhost/Corelicious/blob/master/Corelicious.Lucene/DelagatingCollector.cs>
>
> var collector = new DelegatingCollector((reader, id) => {
>     var document = reader.Document(id);
>     // Do something with your document.
> });
> searcher.Search(query, collector);
>
> On 2012-12-04 17:35, Omri Suissa wrote:
>
>   Hi,
>
> I want to enumerate all the documents in the index that fit a specific
> query (let's say "cat") and perform a task in my database.
>
> When I search I need to supply a collector; my common search method uses
> TopScoreDocCollector, which takes numHits in its Create method.
>
> I don't want to limit the number of documents (if there are 10M documents I
> want to get all 10M results), so I use int.MaxValue, but then I get an
> out-of-memory exception.
>
> Lucene doesn't support paging, so I can't ask for X documents at a time
> starting from document Y (because to get the 20th document I need to get
> the first one first).
>
> What can I do?
>
>
>
> Thanks,
>
> Omri
>
>
>
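
The DelegatingCollector source is linked in the thread but not reproduced
here. As a rough sketch of what such a collector might look like, the class
below forwards each hit to a callback as it arrives instead of buffering
results. The name StreamingCollector is hypothetical, and the sketch assumes
the Lucene.Net 3.0.x Collector API, where AcceptsDocsOutOfOrder is a property
(in 2.9.x it is a method); it is not the linked implementation.

```csharp
using System;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical class name; a sketch against the Lucene.Net 3.0.x Collector API.
public sealed class StreamingCollector : Collector
{
    private readonly Action<IndexReader, int> _onHit;
    private IndexReader _reader;

    public StreamingCollector(Action<IndexReader, int> onHit)
    {
        if (onHit == null) throw new ArgumentNullException("onHit");
        _onHit = onHit;
    }

    // Scores are not needed when every matching document is processed.
    public override void SetScorer(Scorer scorer) { }

    // Called once per index segment; document ids passed to Collect
    // are relative to this reader.
    public override void SetNextReader(IndexReader reader, int docBase)
    {
        _reader = reader;
    }

    // Called once per matching document; nothing is stored here.
    public override void Collect(int doc)
    {
        _onHit(_reader, doc);
    }

    // Order does not matter for this use, so let Lucene pick the fastest scorer.
    public override bool AcceptsDocsOutOfOrder
    {
        get { return true; }
    }
}
```

It would be used the same way as in the quoted example, e.g.
searcher.Search(query, new StreamingCollector((reader, id) => { ... })),
so memory use stays flat regardless of how many documents match.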
