lucene-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SolrPerformanceProblems" by ShawnHeisey
Date Fri, 12 May 2017 18:22:22 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrPerformanceProblems" page has been changed by ShawnHeisey:
https://wiki.apache.org/solr/SolrPerformanceProblems?action=diff&rev1=75&rev2=76

Comment:
Improve information about ZK.  Properly capitalize ZooKeeper. Don't link the new capitalization.

  
  === SolrCloud ===
  
- Regardless of the number of nodes or available resources, SolrCloud begins to have stability
problems when the number of collections reaches the low hundreds.  With thousands of collections,
any little problem or change to the cluster can cause a stability death spiral that may not
recover for tens of minutes.  Try to keep the number of collections as low as possible.  These
problems are due to how SolrCloud updates cluster state in zookeeper in response to cluster
changes.  Work is underway to try and improve this situation.
+ Regardless of the number of nodes or available resources, SolrCloud begins to have stability
problems when the number of collections reaches the low hundreds.  With thousands of collections,
any little problem or change to the cluster can cause a stability death spiral that may not
recover for tens of minutes.  Try to keep the number of collections as low as possible.  These
problems are due to how SolrCloud updates cluster state in !ZooKeeper in response to cluster
changes.  Work is underway to try and improve this situation.
  
- Because SolrCloud relies heavily on Zookeeper, it can be very unstable if you have underlying
performance issues that result in operations taking longer than the [[SolrCloud#SolrCloud_Instance_ZooKeeper_Params|zkClientTimeout]].
 Increasing that timeout can help, but addressing the underlying performance issues  will
yield better results.  The default timeout (15 seconds internally, and 30 seconds in most
recent example configs) is quite long and should be more than enough for a well-tuned SolrCloud
install.
+ Because SolrCloud relies heavily on !ZooKeeper, it can be very unstable if you have underlying
performance issues that result in operations taking longer than the [[SolrCloud#SolrCloud_Instance_Zookeeper_Params|zkClientTimeout]].
 Increasing that timeout can help, but addressing the underlying performance issues  will
yield better results.  The default timeout (15 seconds internally, and 30 seconds in most
recent example configs) is quite long and should be more than enough for a well-tuned SolrCloud
install.
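
As a rough illustration of the timeout figures quoted above, the following sketch flags observed stalls that are at least as long as the zkClientTimeout. The helper name, the sample values, and the idea of feeding it pause durations (for example stop-the-world GC pauses read from a GC log) are assumptions made for illustration; nothing here is part of Solr or ZooKeeper.

```python
# Timeout values quoted in the text above, in milliseconds.
INTERNAL_DEFAULT_MS = 15_000   # internal default
EXAMPLE_CONFIG_MS = 30_000     # most recent example configs


def pauses_exceeding_timeout(pauses_ms, zk_client_timeout_ms=EXAMPLE_CONFIG_MS):
    """Return the pauses (in ms) that are at least as long as the timeout.

    Hypothetical helper: pauses_ms is any iterable of observed stall
    durations, however they were measured.
    """
    return [p for p in pauses_ms if p >= zk_client_timeout_ms]


if __name__ == "__main__":
    observed = [120, 4_500, 17_000, 31_000]  # made-up sample data
    print(pauses_exceeding_timeout(observed))
    print(pauses_exceeding_timeout(observed, INTERNAL_DEFAULT_MS))
```

A stall that shows up in this list is a candidate for the kind of underlying performance issue the paragraph describes; raising the timeout only hides it.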
  
- Zookeeper's design assumes that it has extremely fast read and write access to its database.
 If the Zookeeper database is stored on the same disks that hold the Solr data, any performance
problems with Solr will delay Zookeeper's access to its own database.  This can lead to a
performance death spiral where each ZK timeout results in recovery operations which cause
further timeouts.
+ !ZooKeeper's design assumes that it has extremely fast access to its database.  If the !ZooKeeper
database is stored on the same disks that hold the Solr data, any performance problems with
Solr will delay !ZooKeeper's access to its own database.  This can lead to a performance death
spiral where each ZK timeout results in recovery operations which cause further timeouts.
  
- "Extremely fast" reads and writes mean that the OS must be able to cache the database in
its disk cache.  If the disk cache is too small, the OS will have to read the disk in order
to get zookeeper data.  Disks are slow, and when there is a lot of I/O because Solr is having
performance issues, a Zookeeper read or write may get buried in the I/O scheduler queue and
take even longer to complete.  We strongly recommend storing the Zookeeper data on separate
physical disks from the Solr data.  Having dedicated machines for all ZK nodes (a minimum
of three nodes are required for redundancy) is even better, but not a requirement.
+ !ZooKeeper holds its database in Java heap memory, so disk read performance isn't quite
as critical as disk write performance.  In situations where the OS disk cache is too small
for Solr's needs and the ZK database is on the same disk as Solr data, a large amount of disk
access for Solr can interfere with ZK writes.  Using very fast disks for ZK (SSD in particular)
will result in good performance.  Using separate physical disks for Solr and ZK data is strongly
recommended.  Having dedicated machines for all ZK nodes (a minimum of three nodes are required
for redundancy) is even better, but not strictly a requirement.
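
One quick way to spot the shared-disk situation described above is to compare the device IDs of the two data directories. This is only a minimal sketch: the directory paths are hypothetical placeholders, and `st_dev` identifies a filesystem or device, not a physical disk. A match means the two directories definitely share storage, while a mismatch does not prove that they sit on separate physical disks (they could be two partitions of one disk).

```python
import os


def same_device(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations; substitute the real data directories.
    zk_data = "/var/lib/zookeeper"
    solr_data = "/var/solr/data"
    if os.path.exists(zk_data) and os.path.exists(solr_data):
        if same_device(zk_data, solr_data):
            print("ZK and Solr data share a device; consider separating them.")
        else:
            print("Different device IDs; check that the physical disks also differ.")
    else:
        print("Placeholder paths not found; edit them for this machine.")
```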
  
  == RAM ==
  
