lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Searching in multiple indexes with more than 2147483519 documents
Date Thu, 08 Dec 2016 13:53:51 GMT
You can search each index separately yourself and then use the
TopDocs.merge API to merge-sort the results.  That API can handle more
than 2.1 billion documents in total, and each ScoreDoc hit carries a
shardIndex, so you know which of your indices to go back to, e.g. to
load stored fields.
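
To make the merge step concrete, here is a minimal plain-Java sketch of the
idea, not Lucene's actual code. The names ShardMergeSketch, Hit and merge are
hypothetical and exist only for this illustration; in Lucene the corresponding
pieces are TopDocs.merge and ScoreDoc.shardIndex mentioned above. It assumes
each per-index hit list is already sorted by descending score. Document IDs
stay local to their index, so a hit is identified by the pair
(shardIndex, doc) and no single int has to cover all documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class ShardMergeSketch {

    /** One hit: a per-index document ID plus the index (shard) it came from. Hypothetical type. */
    public static final class Hit {
        public final int shardIndex; // position of the source index in the input list
        public final int doc;        // document ID, only meaningful within that index
        public final float score;

        public Hit(int shardIndex, int doc, float score) {
            this.shardIndex = shardIndex;
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Merges per-shard hit lists into the global top N.
     * Assumes every input list is already sorted by descending score.
     */
    public static List<Hit> merge(int topN, List<List<Hit>> shardHits) {
        // Each queue entry is {shard position, offset of that shard's next unread hit}.
        PriorityQueue<int[]> queue = new PriorityQueue<>((a, b) -> {
            Hit ha = shardHits.get(a[0]).get(a[1]);
            Hit hb = shardHits.get(b[0]).get(b[1]);
            int byScore = Float.compare(hb.score, ha.score); // higher score first
            return byScore != 0 ? byScore : Integer.compare(a[0], b[0]); // tie: lower shard first
        });

        // Seed the queue with the best hit of every non-empty shard.
        for (int s = 0; s < shardHits.size(); s++) {
            if (!shardHits.get(s).isEmpty()) {
                queue.add(new int[] {s, 0});
            }
        }

        List<Hit> merged = new ArrayList<>();
        while (merged.size() < topN && !queue.isEmpty()) {
            int[] top = queue.poll();
            List<Hit> hits = shardHits.get(top[0]);
            merged.add(hits.get(top[1]));
            // Advance within the shard that just supplied a hit.
            if (top[1] + 1 < hits.size()) {
                queue.add(new int[] {top[0], top[1] + 1});
            }
        }
        return merged;
    }
}
```

After merging, each Hit tells you which index to reopen (shardIndex) and which
local document to load there (doc). Two hits from different indices may share
the same doc value, which is why the shard index has to travel with the hit.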

Mike McCandless

http://blog.mikemccandless.com


On Thu, Dec 8, 2016 at 8:50 AM, Andres de la Peña <adelapena@stratio.com> wrote:
> Hi all,
>
> A Lucene index can't contain more than 2147483519 documents, so we want to
> split a larger dataset into multiple indexes. However, it is not possible to
> create a MultiReader to search in all the index partitions at once:
>
> Too many documents: composite IndexReaders cannot exceed 2147483519 but
> readers have total maxDoc=2171401446
>
>
> What do you think is the best way to search in several indexes containing
> more than 2147483519 documents in total? Maybe searching in each index and
> merging the results in a MemoryIndex/RAMIndex?
>
> Thanks,
>
> --
> Andrés de la Peña
>
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
