lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jay Potharaju <jspothar...@gmail.com>
Subject solr multicore vs sharding vs 1 big collection
Date Sun, 02 Aug 2015 00:49:49 GMT
Hi

I currently have a single collection with 40 million documents and index
size of 25 GB. The collections gets updated every n minutes and as a result
the number of deleted documents is constantly growing. The data in the
collection is an amalgamation of more than 1000+ customer records. The
number of documents per each customer is around 100,000 records on average.

Now that being said, I 'm trying to get an handle on the growing deleted
document size. Because of the growing index size both the disk space and
memory is being used up. And would like to reduce it to a manageable size.

I have been thinking of splitting the data into multiple core, 1 for each
customer. This would allow me manage the smaller collection easily and can
create/update the collection also fast. My concern is that number of
collections might become an issue. Any suggestions on how to address this
problem. What are my other alternatives to moving to a multicore
collections.?

Solr: 4.9
Index size:25 GB
Max doc: 40 million
Doc count:29 million

Replication:4

4 servers in solrcloud.

Thanks
Jay

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message