lucene-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-11882) SolrMetric registries retain references to SolrCores when closed
Date Thu, 01 Feb 2018 16:20:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348833#comment-16348833
] 

Andrzej Bialecki  commented on SOLR-11882:
------------------------------------------

It turns out that this fix is wrong... :(

The new section in {{SolrCoreMetricManager.close()}} causes the new instances of gauges to
be closed: the new core is registered first (and registers new instances of metrics), and
only then is the old one closed, so it closes the new metrics instead of the old ones.

One solution, which is more complicated than I'd like, is to use a subclass of Gauge that
has a tag (the same as we do with MetricReporters) and remove instances only when the tag
matches the one in the core that is being closed. The alternative is to revert this fix
and see if there's something better that we could do here.
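
A minimal sketch of the tagged-gauge idea, to make the ordering problem concrete. Everything here is a hypothetical stand-in written for illustration: {{Gauge}}, {{TaggedGauge}}, {{Registry}}, {{removeByTag}}, the metric name and the tag values are not the Solr or metrics-library classes, and the real fix may look quite different. It only shows that when the old core closes after the new one has registered under the same metric name, a tag check leaves the new gauge in place.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration only; not Solr or metrics-library classes.
class TaggedGaugeSketch {

    /** Minimal gauge abstraction standing in for the metrics library's gauge type. */
    interface Gauge<T> {
        T getValue();
    }

    /** A gauge that remembers which core instance registered it. */
    static final class TaggedGauge<T> implements Gauge<T> {
        private final String tag;
        private final Supplier<T> supplier;

        TaggedGauge(String tag, Supplier<T> supplier) {
            this.tag = tag;
            this.supplier = supplier;
        }

        String getTag() {
            return tag;
        }

        @Override
        public T getValue() {
            return supplier.get();
        }
    }

    /** Registry keyed by metric name; old and new core instances use the same names. */
    static final class Registry {
        private final Map<String, Gauge<?>> gauges = new ConcurrentHashMap<>();

        void register(String name, Gauge<?> gauge) {
            gauges.put(name, gauge); // a later registration replaces the earlier one
        }

        Gauge<?> get(String name) {
            return gauges.get(name);
        }

        /** Removes only gauges carrying the given tag and returns how many were removed. */
        int removeByTag(String tag) {
            int removed = 0;
            for (Map.Entry<String, Gauge<?>> e : gauges.entrySet()) {
                Gauge<?> g = e.getValue();
                if (g instanceof TaggedGauge && tag.equals(((TaggedGauge<?>) g).getTag())) {
                    // Conditional remove: only drops the entry if it still maps to this gauge.
                    if (gauges.remove(e.getKey(), g)) {
                        removed++;
                    }
                }
            }
            return removed;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("searcher.numDocs", new TaggedGauge<Integer>("core-old", () -> 10));
        // Reload: the new core registers first, under the same metric name.
        registry.register("searcher.numDocs", new TaggedGauge<Integer>("core-new", () -> 20));
        // Only then is the old core closed; the tag check skips the new core's gauge.
        int removed = registry.removeByTag("core-old");
        System.out.println(removed + " removed, gauge still registered: "
                + (registry.get("searcher.numDocs") != null));
    }
}
```

Removing by metric name alone at the last step would drop the gauge that the new core just registered, which is the failure described above.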

> SolrMetric registries retain references to SolrCores when closed
> ----------------------------------------------------------------
>
>                 Key: SOLR-11882
>                 URL: https://issues.apache.org/jira/browse/SOLR-11882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: metrics, Server
>    Affects Versions: 7.1
>            Reporter: Eros Taborelli
>            Assignee: Erick Erickson
>            Priority: Major
>             Fix For: 7.3
>
>         Attachments: SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch,
create-cores.zip, solr-dump-full_Leak_Suspects.zip, solr.config.zip
>
>
> *Description:*
> Our setup involves using a lot of small cores (possibly a hundred thousand), but working
only on a few of them at any given time.
> We already followed all recommendations in this guide: [https://wiki.apache.org/solr/LotsOfCores]
> We noticed that after creating/loading around 1000-2000 empty cores, with no documents
inside, the heap consumption went through the roof despite having set transientCacheSize to
only 64 (heap size set to 12G).
> All cores are correctly set to loadOnStartup=false and transient=true, and we have verified
via logs that the cores in excess are actually being closed.
> However, a reference remains in the org.apache.solr.metrics.SolrMetricManager#registries
that is never removed until a core is fully unloaded.
> Restarting the JVM loads all cores in the admin UI, but doesn't populate the ConcurrentHashMap
until a core is actually fully loaded.
> I reproduced the issue on a smaller scale (transientCacheSize = 5, heap size = 512m)
and made a report (attached) using eclipse MAT.
> *Desired outcome:*
> When a transient core is closed, the references in the SolrMetricManager should be removed,
in the same fashion the reporters for the core are also closed and removed.
> Alternatively, an unloadOnClose=true|false flag could be implemented to fully unload
a transient core when closed due to the cache size.
> *Note:*
> The documentation mentions everywhere that the unused cores will be unloaded, but it's
misleading as the cores are never fully unloaded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

