lucene-solr-user mailing list archives

From Yago Riveiro <yago.rive...@gmail.com>
Subject Re: Scaling SolrCloud
Date Wed, 20 Jan 2016 14:25:51 GMT
Our ZooKeeper cluster is an ensemble of 5 machines, which is a good starting
point: 3 is too risky (once you lose one node, a second failure costs you
quorum), and with 7 the sync cost increases.
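
A minimal sketch of the quorum arithmetic behind that choice, assuming the usual ZooKeeper rule that a strict majority of the ensemble must be reachable (the function names are illustrative only):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 7):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under that rule a 3-node ensemble survives one failure, a 5-node ensemble survives two, and a 7-node ensemble survives three, at the price of more servers taking part in every write.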

  

The ZK cluster runs on machines with no other IO load and with rotational HDDs
(don't use SSDs to gain IO performance; ZooKeeper is optimized for spinning
disks).

  

The ZK cluster behaves without problems. The first deployment of ZK was on the
same machines as the Solr cluster (with the ZK log on its own HDD) and that
didn't work very well: the CPU and network IO from the Solr cluster was too much.

  

About schema modifications.  
  
Modifying the schema to add new fields is relatively simple with the new API.
In the past all the work was manual: uploading the schema to ZK and reloading
all collections (indexing must be disabled, or timeouts and funny errors happen).  
  
With the new Schema API this is more user friendly. Anyway, I still stop
indexing and reload the collections (I don't know if that's necessary nowadays).  
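
As a rough illustration of that workflow (not the exact tooling used here), the sketch below builds a Schema API `add-field` request and a Collections API `RELOAD` URL for one collection. The base URL, collection name, and field definition are placeholder assumptions, and nothing is sent until you call `urlopen` yourself.

```python
import json
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: adjust to your node


def add_field_request(collection, name, field_type, stored=True):
    """Build (but do not send) a Schema API request that adds one field."""
    payload = {"add-field": {"name": name, "type": field_type, "stored": stored}}
    return urllib.request.Request(
        f"{SOLR_URL}/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reload_url(collection):
    """URL of the Collections API call that reloads one collection."""
    query = urllib.parse.urlencode({"action": "RELOAD", "name": collection})
    return f"{SOLR_URL}/admin/collections?{query}"


# Usage (hypothetical collection and field; indexing paused beforehand):
# urllib.request.urlopen(add_field_request("events", "country", "string"))
# urllib.request.urlopen(reload_url("events"))
```

With 500 collections the same two calls would simply be repeated per collection (or per shared configset), which is why pausing indexing first keeps the reloads predictable.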
  
About Indexing data.

  

We have a self-made data importer; it's not Java and it does not perform batch
indexing (with 500 collections, buffering data and building the batches is
expensive and complicates error handling).

  

We use regular HTTP POST with JSON. Our throughput is about 1000 docs/s without
any type of optimization. Sometimes we have issues with replication: the replica
can't keep pace with the leader's insertions and a full sync is requested. This
is bad because syncing the replica again implies a lot of IO wait and CPU, and
with 100G replicas it takes an hour or more (normally when this happens, we
disable indexing to release IO and CPU and not kill the node with a load of 50
or 60).  
  
In this department my advice is "keep it simple": in the end it is an HTTP POST
to a node of the cluster.
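
A minimal sketch of what such a POST can look like, assuming Solr's JSON update handler at `/solr/<collection>/update`; the node URL, collection name, and document fields are made up for illustration, and commits are assumed to be handled by autoCommit on the server side.

```python
import json
import urllib.request


def update_request(node_url, collection, docs):
    """Build (but do not send) a JSON update request for a list of documents."""
    return urllib.request.Request(
        f"{node_url}/solr/{collection}/update",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (hypothetical node and document); any node of the cluster can receive
# the request and SolrCloud routes the document to the right shard leader:
# req = update_request("http://solr-node-1:8983", "events",
#                      [{"id": "doc-1", "country": "ES"}])
# urllib.request.urlopen(req)
```

Sending one document per request, as here, trades throughput for very simple error handling: a failed POST maps to exactly one document to retry.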

  

--

/Yago Riveiro

> On Jan 20 2016, at 1:39 pm, Troy Edwards <tedwards415107@gmail.com> wrote:
>
> Thank you for sharing your experiences/ideas.
>
> Yago since you have 8 billion documents over 500 collections, can you share
> what/how you do index maintenance (e.g. add field)? And how are you loading
> data into the index? Any experiences around how Zookeeper ensemble behaves
> with so many collections?
>
> Best,
>
> On Tue, Jan 19, 2016 at 6:05 PM, Yago Riveiro <yago.riveiro@gmail.com> wrote:

>

> > What I can say is:
> >
> > * SSD (crucial for performance if the index doesn't fit in memory, and
> >   it will not fit).
> > * Divide and conquer: for that volume of docs you will need more than 6
> >   nodes.
> > * DocValues, to avoid stressing the Java heap.
> > * Will you aggregate data? If yes, what is your max cardinality? This
> >   question is the most important one for sizing the memory needs correctly.
> > * Latency is important too: what threshold is acceptable before you
> >   consider a query slow?
> >
> > At my company we are running a 12 terabyte (2 replicas) Solr cluster with
> > 8 billion documents spread over 500 collections. For this we have about 12
> > machines with SSDs and 32G of RAM each (~24G for the heap).
> >
> > We don't have a strict need for speed; a 30 second query to aggregate 100
> > million documents with 1M unique keys is fast enough for us. Normally
> > aggregation performance decreases as the number of unique keys increases;
> > with a low number of unique keys, queries take less than 2 seconds if the
> > data is in the OS cache.
> >
> > Personal recommendations:
> >
> > * Sharding is important and smart sharding is crucial: you don't want to
> >   run queries on data that is not interesting (this slows down queries
> >   when the dataset is big).
> > * If you want to measure speed, do it with about 1 billion documents to
> >   simulate something real (real for a 10 billion document world).
> > * Index with re-indexing in mind. With 10 billion docs, re-indexing the
> >   data takes months ... This is important if you don't use regular
> >   features of Solr. In my case I configured DocValues with the disk format
> >   (not a standard feature in 4.x) and at some point this format was
> >   deprecated. Upgrading Solr to 5.x was an epic 3 month battle to do it
> >   without full downtime.
> > * Solr is like your girlfriend: it will demand love and care and plenty of
> >   space to fully recover replicas that at some point get out of sync. This
> >   happens a lot when restarting nodes (and is annoying with 100G
> >   replicas); don't underestimate this point. Free space can save your life.
> >
> > --
> >
> > /Yago Riveiro
> >
> > > On Jan 19 2016, at 11:26 pm, Shawn Heisey <apache@elyograg.org> wrote:
> > >
> > > On 1/19/2016 1:30 PM, Troy Edwards wrote:
> > > > We are currently "beta testing" a SolrCloud with 2 nodes and 2 shards
> > > > with 2 replicas each. The number of documents is about 125000.
> > > >
> > > > We now want to scale this to about 10 billion documents.
> > > >
> > > > What are the steps to prototyping, hardware estimation and stress
> > > > testing?
> > >
> > > There is no general information available for sizing, because there are
> > > too many factors that will affect the answers. Some of the important
> > > information that you need will be impossible to predict until you
> > > actually build it and subject it to a real query load.
> > >
> > > https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> > >
> > > With an index of 10 billion documents, you may not be able to precisely
> > > predict performance and hardware requirements from a small-scale
> > > prototype. You'll likely need to build a full-scale system on a small
> > > testbed, look for bottlenecks, ask for advice, and plan on a larger
> > > system for production.
> > >
> > > The hard limit for documents on a single shard is slightly less than
> > > Java's Integer.MAX_VALUE -- just over two billion. Because deleted
> > > documents count against this max, about one billion documents per shard
> > > is the absolute max that should be loaded in practice.
> > >
> > > BUT, if you actually try to put one billion documents in a single
> > > server, performance will likely be awful. A more reasonable limit per
> > > machine is 100 million ... but even this is quite large. You might need
> > > smaller shards, or you might be able to get good performance with larger
> > > shards. It all depends on things that you may not even know yet.
> > >
> > > Memory is always a strong driver for Solr performance, and I am speaking
> > > specifically of OS disk cache -- memory that has not been allocated by
> > > any program. With 10 billion documents, your total index size will
> > > likely be hundreds of gigabytes, and might even reach terabyte scale.
> > > Good performance with indexes this large will require a lot of total
> > > memory, which probably means that you will need a lot of servers and
> > > many shards. SSD storage is strongly recommended.
> > >
> > > For extreme scaling on Solr, especially if the query rate will be high,
> > > it is recommended to only have one shard replica per server.
> > >
> > > I have just added an "extreme scaling" section to the following wiki
> > > page, but it's mostly a placeholder right now. I would like to have a
> > > discussion with people who operate very large indexes so I can put real
> > > usable information in this section. I'm on IRC quite frequently in the
> > > #solr channel.
> > >
> > > https://wiki.apache.org/solr/SolrPerformanceProblems
> > >
> > > Thanks,
> > > Shawn

