lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SolrCloud" by Per Steffensen
Date Fri, 30 Nov 2012 13:42:01 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrCloud" page has been changed by Per Steffensen:
http://wiki.apache.org/solr/SolrCloud?action=diff&rev1=79&rev2=80

Comment:
A little more detail about the create operation of the Collections API - and preparing descriptions covering SOLR-4114 and SOLR-4120

   1. If you do colocate ZooKeeper with Solr, using separate disk drives for Solr and ZooKeeper
will help with performance.
  
  == Managing collections via the Collections API ==
- The collections API lets you manage collections. Under the hood, it generally uses the CoreAdmin API to manage SolrCores on each server - it's essentially sugar for actions that you could handle yourself if you made individual CoreAdmin API calls to each server you wanted an action to take place on.
+ The collections API lets you manage collections. Under the hood, it generally uses the CoreAdmin API to asynchronously (through the Overseer) manage SolrCores on each server - it's essentially sugar for actions that you could handle yourself if you made individual CoreAdmin API calls to each server you wanted an action to take place on.
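
For illustration, the kind of per-core CoreAdmin call the create operation boils down to on each node looks roughly like the following (the core name here is a hypothetical example - Solr generates the actual names):

http://localhost:8983/solr/admin/cores?action=CREATE&name=mycollection_shard1_replica1&collection=mycollection&shard=shard1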
  
  Create http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
+ 
+ About the params (a full example request using them is sketched below the list):
+  * '''name''': The name of the collection to be created
+  * '''numShards''': The number of slices (sometimes called shards) to be created as part
of the collection
+  * '''replicationFactor''': The number of "additional" shards (sometimes called replicas) to be created for each slice. Set it to 0 to have "one shard for each of your slices", set it to 1 to have "two shards for each of your slices", etc. With a value of 0 your data will not be replicated.
+  * '''maxShardsPerNode''' (not in 4.0.0 and not even committed yet - see SOLR-4114): A create operation will spread numShards*(replicationFactor+1) shards across your live Solr nodes - fairly distributed, and never two shards of the same slice on the same Solr node. If a Solr node is not live at the time the create operation is carried out, it will not get any shards of the new collection. To prevent too many shards being created on a single Solr node, use maxShardsPerNode to set a limit on how many shards the create operation is allowed to create on each node - the default is 1. If the entire collection (numShards*(replicationFactor+1) shards) cannot fit on your live Solr nodes, nothing will be created at all. Unfortunately, since the create operation is carried out asynchronously, you will not get any feedback about a decision not to create the collection.
+  * '''createNodeSet''' (not in 4.0.0 and not even committed yet - see SOLR-4120): If not provided, the create operation will spread shards across all of your live Solr nodes. You can provide the "createNodeSet" parameter to change the set of nodes to spread the shards across. The format of values for this param is "<node-name1>,<node-name2>,...,<node-nameN>" - e.g. "localhost:8983_solr,localhost:8984_solr,localhost:8985_solr"
  
  Note: replicationFactor defines the maximum number of replicas created in addition to the leader, from amongst the nodes currently running (i.e. nodes added later will not be used for this collection). Imagine you have a cluster with 20 nodes and want to add an additional, smaller collection to your installation with 2 shards, each shard with a leader and two replicas. You would specify replicationFactor=2. Six of your nodes will then host this new collection and the other 14 will not.
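
To make the arithmetic behind that example explicit, a small sketch using the numbers from the note (20 nodes, 2 shards, replicationFactor=2):

{{{
# Worked example: how many SolrCores the create operation lays out, and on how many nodes.
numShards = 2            # slices requested
replicationFactor = 2    # additional shards per slice, i.e. leader + 2 replicas
liveNodes = 20

totalCores = numShards * (replicationFactor + 1)   # 2 * 3 = 6 SolrCores
nodesUsed = min(totalCores, liveNodes)             # at most one core per node here -> 6 nodes
print(totalCores, nodesUsed, liveNodes - nodesUsed)  # 6 cores, 6 hosting nodes, 14 unused
}}}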
  
