lucene-solr-user mailing list archives

From lstusr 5u93n4 <>
Subject solr reads whole index on startup
Date Wed, 05 Dec 2018 17:53:36 GMT
Hi All,

We have a collection:
  - Solr 7.5
  - 3 shards, replication factor 2, for a total of 6 NRT replicas
  - 3 servers, 16 GB RAM each
  - 2 billion documents
  - autoAddReplicas: false
  - 2.1 TB on-disk index size
  - index stored on HDFS on separate servers

If we (gracefully) shut down solr on all 3 servers, when we re-launch solr
we notice that the nodes go into "Recovering" state for about 10-12 hours
before finally coming alive.

During this recovery time, we notice high outbound network traffic from our
HDFS servers to our solr servers. The total amount transferred is roughly
equal to the index size on disk.

So it seems to us that on startup, solr has to re-read the entire index
before coming back alive.
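
As a rough sanity check on those numbers, here is a minimal sketch of the
average transfer rate implied by reading the whole index once during the
recovery window. It assumes that 2.1 TB means 2.1 * 10^12 bytes, that the
index is read exactly once, and that the load is spread evenly over the 3
solr servers; the function name is only illustrative.

```python
def implied_throughput_mb_s(index_bytes: float, hours: float) -> float:
    """Average rate in MB/s (decimal) needed to move index_bytes in `hours`."""
    return index_bytes / (hours * 3600) / 1e6


INDEX_BYTES = 2.1e12  # assumption: 2.1 TB = 2.1 * 10^12 bytes
SOLR_SERVERS = 3      # from the collection description above

for hours in (10, 12):
    total = implied_throughput_mb_s(INDEX_BYTES, hours)
    per_server = total / SOLR_SERVERS
    print(f"{hours} h: {total:.1f} MB/s aggregate, {per_server:.1f} MB/s per server")
```

Under those assumptions this works out to roughly 49 to 58 MB/s in aggregate,
which is in line with a single full pass over the index during the 10-12 hour
recovery.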

1. Is this assumption correct?
2. Is there any way to mitigate this, so that solr can start up faster?


