lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "LotsOfCores" by ShalinMangar
Date Fri, 24 Jul 2009 11:34:10 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by ShalinMangar:
http://wiki.apache.org/solr/LotsOfCores

The comment on the change is:
Added issues, configuration and other details

------------------------------------------------------------------------------
  <!> ["Solr1.5"]
  
  [[TableOfContents]]
+ 
+ = Overview =
  
  Solr is currently not well suited to a large number of homogeneous cores that must be
loaded and unloaded quickly and frequently. In such setups, a core often needs to be loaded
just to run a single search query or to index a single document.
  
@@ -13, +15 @@

   1. LRU Core Loading/Unloading - With a large number of cores, not all of them can be
kept loaded at all times. There has to be an upper limit beyond which some cores are
unloaded.
   1. Automatic allotment of dataDir for cores - If the number of cores is too high, the
cores' dataDirs cannot all live in the same directory. There is an upper limit on the number
of directories that can be created in a single directory without affecting performance.
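
The LRU loading/unloading idea above can be sketched in a few lines of Java. This is only an illustration and is not taken from the SOLR-1293 patch: the class name `LruCoreRegistry` and its methods are hypothetical, and "unloading" is reduced to recording the evicted core name. It relies on the access-ordered mode of `java.util.LinkedHashMap`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an LRU registry of loaded cores; not part of Solr. */
class LruCoreRegistry {
    private final int maxCores;
    private final List<String> unloaded = new ArrayList<>();
    private final LinkedHashMap<String, Boolean> loaded;

    LruCoreRegistry(int maxCores) {
        this.maxCores = maxCores;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.loaded = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                if (size() <= LruCoreRegistry.this.maxCores) {
                    return false;
                }
                // A real system would close the core here; the sketch only records it.
                unloaded.add(eldest.getKey());
                return true;
            }
        };
    }

    /** Marks a core as used, "loading" it first if it is not loaded yet. */
    synchronized void touch(String coreName) {
        // get() counts as an access and moves an existing entry to the most-recent end.
        if (loaded.get(coreName) == null) {
            loaded.put(coreName, Boolean.TRUE);
        }
    }

    synchronized boolean isLoaded(String coreName) {
        return loaded.containsKey(coreName);
    }

    synchronized List<String> unloadedCores() {
        return new ArrayList<>(unloaded);
    }
}
```

With a limit of 2, touching cores a, b, a and then c evicts b, because a was used more recently than b.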
  
+ = Issues =
+  * [https://issues.apache.org/jira/browse/SOLR-1293 SOLR-1293] - Support for large number
of cores and faster loading/unloading of cores. This issue has many child issues focusing
on individual changes:
+   * [https://issues.apache.org/jira/browse/SOLR-919 SOLR-919] - Cache and reuse SolrConfig
+   * [https://issues.apache.org/jira/browse/SOLR-920 SOLR-920] - Cache and reuse IndexSchema
+   * [https://issues.apache.org/jira/browse/SOLR-921 SOLR-921] - SolrResourceLoader must
cache short name vs fully qualified name
+   * [https://issues.apache.org/jira/browse/SOLR-880 SOLR-880] - SolrCore should have a STOP
option and a lazy startup option
+   * [https://issues.apache.org/jira/browse/SOLR-1108 SOLR-1108] - Remove synchronization
in SolrCore constructor
+   * [https://issues.apache.org/jira/browse/SOLR-1028 SOLR-1028] - Automatic core loading
unloading for multicore
+   * [https://issues.apache.org/jira/browse/SOLR-943 SOLR-943] - Make it possible to specify
dataDir in solr.xml
+   * [https://issues.apache.org/jira/browse/SOLR-1306 SOLR-1306] - Support pluggable persistence/loading
of solr.xml details
+   * [https://issues.apache.org/jira/browse/SOLR-1106 SOLR-1106] - Pluggable CoreAdminHandler
(Action ) architecture that allows for custom handler access to CoreContainer / request-response
+ 
+ Other features which may be needed for such a system include:
+  * Provide a way to partition data directories into multiple "bucket" directories. For example,
instead of creating 10,000 data directories inside one base data directory, Solr could assign
each core to one of 4 bucket directories, thereby distributing the data directories among them.
+  * Provide a way to completely remove a core - Currently the unload command keeps the core's
data on disk even though the details of the core are deleted from the configuration. Solr could
offer an option to clean the data directory when a core is unloaded.
+  * Changes to SolrJ for new start/stop commands and better error codes/messages.
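
As a rough illustration of the bucket idea in the list above, the following Java sketch maps a core name to a data directory under one of several bucket directories. All names here (`BucketedDataDirs`, `dataDirFor`, the `bucketN` directory naming) are assumptions made for the example. It also picks the bucket from a hash of the core name so that the result is repeatable, which is a simplification for this sketch and not necessarily how the patch assigns buckets.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of spreading core data directories over bucket directories. */
class BucketedDataDirs {

    /**
     * Returns the data directory for a core. With numBuckets <= 0 the core lives
     * directly under baseDataDir; otherwise it lives under baseDataDir/bucketN.
     */
    static Path dataDirFor(String baseDataDir, int numBuckets, String coreName) {
        if (numBuckets <= 0) {
            return Paths.get(baseDataDir, coreName);
        }
        // floorMod keeps the bucket index non-negative even for negative hash codes.
        int bucket = Math.floorMod(coreName.hashCode(), numBuckets);
        return Paths.get(baseDataDir, "bucket" + bucket, coreName);
    }

    public static void main(String[] args) {
        System.out.println(dataDirFor("/opt/solr/data", 4, "core0"));
        System.out.println(dataDirFor("/opt/solr/data", 0, "core0"));
    }
}
```

With 10,000 cores and 4 buckets, each bucket directory would hold roughly a quarter of the data directories, assuming the core names hash evenly.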
+ 
+ = Configuration =
+ 
+ The following configuration applies to the patch given in [https://issues.apache.org/jira/browse/SOLR-1293
SOLR-1293].
+ 
+ {{{
+ <?xml version='1.0' encoding='UTF-8'?>
+ <solr persistent='true'>
+   <cores adminPath="/admin/cores"
+           maxCores="4"
+           adminHandler="org.apache.solr.handler.admin.LotsOfCoresAdminHandler"
+           shareSchema="true"
+           shareConfig="true"
+           baseDataDir="/opt/solr/data"
+           numBuckets="4"
+           commonInstanceDir="/opt/solr"
+           cleanOnUnload="true">
+     <core name="core0" instanceDir="/opt/solr" loadOnStart="false"/>
+   </cores>
+ </solr>
+ }}}
+ 
+  * '''maxCores''' - Maximum number of cores that may be loaded at any given point in time. If
this limit is exceeded, the least recently used core is stopped and the new one is started.
+  * '''adminHandler''' - Must be set to the value shown in the above example. The adminHandler
is pluggable in Solr now.
+  * '''shareSchema''' - Ensures that only one instance of IndexSchema is created in the Solr
instance.
+  * '''shareConfig''' - Ensures that only one instance of SolrConfig is created in the Solr
instance.
+  * '''baseDataDir''' - The directory under which the indexes are created. There is no need
to pass the dataDir as a request parameter. Solr automatically assigns a data directory for
each core in this base directory.
+  * '''numBuckets''' - The number of buckets created in 'baseDataDir'. A core
is assigned to one of the buckets randomly. Set it to '0' or omit this attribute if buckets
are not required.
+  * '''commonInstanceDir''' - The default instanceDir for all the cores created.
The 'instanceDir' parameter can be omitted when creating a core if this attribute has been
specified in solr.xml.
+  * '''cleanOnUnload''' - Clean up (delete) the index when a core is unloaded.
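
The effect of 'shareSchema' and 'shareConfig' (one shared instance instead of one per core) can be pictured with a small cache keyed by resource path. This is a minimal sketch under assumptions: `SharedResourceCache` is a hypothetical class, the loader function stands in for whatever parses the schema or config, and none of it is taken from the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical sketch: load a resource once per path and share it between cores. */
class SharedResourceCache<T> {
    private final ConcurrentMap<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    SharedResourceCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached instance for the path, invoking the loader only on first use. */
    T get(String path) {
        return cache.computeIfAbsent(path, loader);
    }
}
```

Every core that asks for the same path receives the same object, so the cost of parsing is paid once rather than on each core load.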
+ 
+ 
+ With the above configuration, the only parameter required for creating a core is the core
name.
+ 
+ = New CoreAdmin Commands =
+ 
+ LotsOfCoresAdminHandler supports two new core admin commands:
+ 
+  * start - Starts a core that is currently stopped.
+  * stop - Stops a core that is currently running.
+ 
+ Example: http://host:80/admin/cores?action=start
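
A client could build such command URLs with a small helper like the one below. This is a sketch under assumptions: the example above does not show how the target core is named, so the `core` request parameter used here is a guess modelled on the existing CoreAdmin commands, and `CoreAdminUrls` is a hypothetical class. The "/admin/cores" path follows the adminPath in the configuration above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for building start/stop command URLs. */
class CoreAdminUrls {

    /**
     * Builds a CoreAdmin command URL. The "core" parameter name is an assumption;
     * check the patch for the parameter it actually expects.
     */
    static String commandUrl(String baseUrl, String action, String coreName) {
        try {
            return baseUrl + "/admin/cores?action=" + action
                    + "&core=" + URLEncoder.encode(coreName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(commandUrl("http://host:80", "start", "core0"));
        System.out.println(commandUrl("http://host:80", "stop", "core0"));
    }
}
```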
+ 
+ = Further work =
+  * Alias/Unalias commands are not fully tested with this patch. In particular, aliases are
not persisted for cores.
+  * We strongly recommend not using the 'alias' feature in Solr because of the high synchronization
overhead it brings.
+  * Alternatively, we should work towards reducing the synchronization involved.
+ 
