Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Date: Thu, 25 Oct 2012 21:05:13 +0000 (UTC)
From: "Erick Erickson (JIRA)" <jira@apache.org>
To: dev@lucene.apache.org
Message-ID: <1008586634.29383.1351199113313.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (SOLR-1293) Support for large no:of cores and
 faster loading/unloading of cores
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/SOLR-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484479#comment-13484479 ] 

Erick Erickson commented on SOLR-1293:
--------------------------------------

I've implemented some parts of this (SOLR-880, SOLR-1028) and should be checking them in relatively soon, then moving on to some other JIRAs related to this one. But I got to thinking that maybe what we really want is two new characteristics for cores; call them loadOnStartup (T|F, default T) and sticky (T|F, default T).

What I've done so far conflates the two ideas: things loaded "lazily" are assumed to be NOT sticky, and there's really no reason to tie the two together. The use cases are:

LOS=T, STICKY=T - really, what we have now. Pay the penalty on startup for loading the core at startup in exchange for speed later.

LOS=T, STICKY=F - load on startup, but allow the core to be automatically unloaded later. For preloading expected 'hot' cores. Cores are unloaded on an LRU basis. NOTE: a core can be unloaded and then loaded again later if it's referenced.

LOS=F, STICKY=T - defer loading the core, but once it's loaded, keep it loaded. Gets us started fast and amortizes the cost of loading cores. I actually expect this one to be the least useful, but it falls out of the others and doesn't cost anything extra to implement coding-wise.

LOS=F, STICKY=F - what I was originally thinking of as "lazy loading". Cores get loaded when first referenced, and swapped out on an LRU basis.

Looking at what I've done on the two JIRAs mentioned, this is actually not at all difficult, just a matter of putting the CoreConfig in the right list...
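
To make the "right list" idea concrete, here is a rough sketch of how the two flags could be treated independently. All class, field, and list names below (CoreFlags, CoreRegistrySketch, loadNow, loadLater, evictable) are made up for illustration and are not existing Solr code; only the two flags and their defaults come from the proposal above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-core settings; both flags default to true in the proposal. */
class CoreFlags {
    final String name;
    final boolean loadOnStartup;
    final boolean sticky;

    CoreFlags(String name, boolean loadOnStartup, boolean sticky) {
        this.name = name;
        this.loadOnStartup = loadOnStartup;
        this.sticky = sticky;
    }
}

/** Hypothetical registry that sorts cores into lists based on the two flags. */
class CoreRegistrySketch {
    final List<String> loadNow = new ArrayList<>();   // LOS=T: load at startup
    final List<String> loadLater = new ArrayList<>(); // LOS=F: load on first reference
    final List<String> evictable = new ArrayList<>(); // STICKY=F: may be unloaded (LRU)

    void register(CoreFlags core) {
        // When to load and whether to keep loaded are decided separately.
        if (core.loadOnStartup) {
            loadNow.add(core.name);
        } else {
            loadLater.add(core.name);
        }
        if (!core.sticky) {
            evictable.add(core.name);
        }
    }
}
```

The point of the sketch is only that the load-time decision and the eviction decision never look at each other, so all four combinations come for free.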

So, if any STICKY=F is found, an LRU cache is created (actually a LinkedHashMap with removeEldestEntry overridden), with an optional size specified in the <cores...> tag. I'd guess I'll default it to 100 or some such if (and only if) there's at least one STICKY=F defined but no cache size in <cores...>. Of course, if the user defines cacheSize in <cores...>, I'd allocate the cache up front.
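
For anyone not familiar with the LinkedHashMap trick, a minimal sketch of such an LRU cache follows. The class name TransientCoreCache and the maxSize parameter are hypothetical, and the value type is left generic rather than tied to any Solr class; in a real implementation the eviction hook would presumably also have to close the core being dropped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU holder for non-sticky cores, keyed by core name. */
class TransientCoreCache<V> extends LinkedHashMap<String, V> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    TransientCoreCache(int maxSize) {
        // accessOrder = true: entries are ordered from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Called after each put; returning true drops the least recently used entry.
        // A real implementation would unload/close the evicted core here.
        return size() > maxSize;
    }
}
```

With a maxSize of 2, putting core1 and core2, reading core1, and then putting core3 evicts core2, since core1 was touched more recently. Note that LinkedHashMap is not thread-safe, so concurrent access would need external synchronization.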

Thoughts?
                
> Support for large no:of cores and faster loading/unloading of cores
> -------------------------------------------------------------------
>
>                 Key: SOLR-1293
>                 URL: https://issues.apache.org/jira/browse/SOLR-1293
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore
>            Reporter: Noble Paul
>             Fix For: 4.1
>
>         Attachments: SOLR-1293.patch
>
>
> Solr is currently not well suited to a large number of homogeneous cores where you require fast/frequent loading/unloading of cores. Usually a core only needs to be loaded to fire a search query or to index a single document.
> The requirements of such a system are:
> * Very efficient loading of cores. Solr cannot afford to read, parse, and create Schema and SolrConfig objects for each core each time the core has to be loaded (SOLR-919, SOLR-920).
> * START/STOP core. Currently it is only possible to unload a core (SOLR-880).
> * Automatic loading of cores. If a core is present but not loaded and a request comes in for it, load it automatically before serving the request.
> * As there are a large number of cores, they cannot all be kept loaded at all times. There has to be an upper limit beyond which we need to unload a few cores (probably the least recently used ones).
> * Automatic allotment of dataDir for cores. If the number of cores is too high, all the cores' dataDirs cannot live in the same dir. There is an upper limit on the number of dirs you can create in a unix dir w/o affecting performance.
