lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: maximum number of shards per SolrCloud
Date Tue, 22 Apr 2014 16:43:01 GMT
On 4/22/2014 10:02 AM, yypvsxf19870706 wrote:
>     I am curious about the effects of having more than 2G docs in a core,
> and we plan to have 5G docs/core.
>
>     Please give me some suggestions on how to plan the number of docs per core.

One Solr core contains one Lucene index, and it can't be divided further 
than that without a significant redesign.  In short: although SolrCloud 
as a whole can handle five billion documents with no problem, you can't 
have five billion documents in a single shard/core.

The only hard limitation in the entire system is that a single Lucene 
index can't hold more than approximately 2 billion documents.  This is 
because Lucene uses a Java int (a signed 32-bit number, maximum value 
2^31 - 1 = 2,147,483,647) for its internal document identifiers.  
Deleted documents count against that limit.  It is theoretically 
possible to remove this limitation, but it would be a MAJOR change to 
Lucene, requiring major changes in Solr as well.
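
Here is a minimal Java sketch that opens a core's Lucene index directly 
and reports how much headroom remains under that cap.  The index path is 
a placeholder, and it assumes Lucene 4.x (where FSDirectory.open takes a 
File).  Note that maxDoc() counts deleted documents too, which is exactly 
why deletions eat into the limit:

    import java.io.File;

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.store.FSDirectory;

    public class MaxDocCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder path -- point it at a core's data/index directory.
            DirectoryReader reader = DirectoryReader.open(
                    FSDirectory.open(new File("/path/to/core/data/index")));
            try {
                int maxDoc = reader.maxDoc();    // includes deleted docs
                int numDocs = reader.numDocs();  // live docs only
                System.out.printf("maxDoc=%d live=%d headroom=%d%n",
                        maxDoc, numDocs, Integer.MAX_VALUE - maxDoc);
            } finally {
                reader.close();
            }
        }
    }

The gap between maxDoc and numDocs is space held by deleted documents; 
merging (optimize) reclaims it.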

The other limitations you can run into with a large SolrCloud are mostly 
a matter of configuration, system resources, and scaling to multiple 
servers.  They are not hard limitations in the software.
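
In practice that means picking numShards when you create the collection, 
so that each shard's core stays well under the limit.  Here is a sketch 
of the Collections API CREATE call, made from Java so the example stays 
self-contained -- the host, port, collection name, and shard/replica 
counts are all placeholder assumptions:

    import java.io.InputStream;
    import java.net.URL;

    public class CreateCollection {
        public static void main(String[] args) throws Exception {
            // Placeholder host/port and names; the sizing math below shows
            // where numShards=25 comes from.
            String url = "http://localhost:8983/solr/admin/collections"
                    + "?action=CREATE&name=bigcollection"
                    + "&numShards=25&replicationFactor=2";
            try (InputStream in = new URL(url).openStream()) {
                int b;
                while ((b = in.read()) != -1) {
                    System.out.write(b);
                }
            }
            System.out.flush();
        }
    }

The same CREATE call is more commonly made with curl or a browser; the 
endpoint and its parameters are what matter.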

I would never put more than about 1 billion documents in a single core.  
For performance reasons, it's a good idea not to exceed a few hundred 
million.  When a high query rate is required, hosting only one Solr core 
per server may be necessary.
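
For the 5 billion documents in the original question, the arithmetic 
looks like this (200 million per shard is just one point inside that 
"few hundred million" range):

    5,000,000,000 docs / 200,000,000 docs per shard = 25 shards

That is where the numShards=25 in the sketch above came from.  Replicas 
come on top of that for redundancy and query throughput.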

Thanks,
Shawn

