Date: Wed, 6 Jun 2012 23:39:23 +0000 (UTC)
From: "Greg Bowyer (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-3514) WeakHashMap in FileFloatSource's cache only cleaned by GC

    [ https://issues.apache.org/jira/browse/SOLR-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290592#comment-13290592 ]

Greg Bowyer commented on SOLR-3514:
-----------------------------------

One thing I will add to this, although I don't reject the idea of better cache management, is that there are issues with reference processing in some JVMs relating to some configurations of GCs.

This was fixed for release in 1.7.0_04 here: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1391adc6681

Essentially, the fix means that references are not traced as part of normal marking, only during big bad stop-the-world GCs.

> WeakHashMap in FileFloatSource's cache only cleaned by GC
> ---------------------------------------------------------
>
>                 Key: SOLR-3514
>                 URL: https://issues.apache.org/jira/browse/SOLR-3514
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.6, 4.0
>            Reporter: Gregg Donovan
>            Priority: Minor
>              Labels: patch
>         Attachments: SOLR-3514.patch
>
>
> We've encountered GC spikes at Etsy after adding new ExternalFileFields a decent number of times. I was always a little confused by this behavior -- isn't it just one big float[]? Why does that cause problems for the GC? -- but looking at the FileFloatSource code a little more carefully, I wonder if this is due to using a WeakHashMap that is only cleaned by GC or by manual invocation of a request handler.
>
> FileFloatSource stores a WeakHashMap keyed by {{IndexReader}}. In the [code|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/search/function/FileFloatSource.java?revision=1310219&view=markup#l135], it mentions that the implementation is modeled after FieldCache. However, the FieldCacheImpl [adds listeners for IndexReader close events and uses those to purge its caches|http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java?revision=1342751&view=markup#l166]. Should we be doing the same in FileFloatSource?
>
> Attached is a mostly untested patch with a possible implementation. There are probably better ways to do it (e.g. I don't love using another WeakHashMap), but I found it tough to hook into the IndexReader lifecycle without a) relying on classes other than FileFloatSource, b) changing the public API of FileFloatSource, or c) changing the implementation too much.
>
> There is a RequestHandler inside of FileFloatSource -- [ReloadCacheRequestHandler|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/search/function/FileFloatSource.java?revision=1310219&view=markup#l303] -- that can be used to clear the cache entirely, but this is sub-optimal for us for a few reasons:
>
> * It clears the entire cache. ExternalFileFields often take some non-trivial time to load, and we prefer to do so during SolrCore warmups. Clearing the entire cache while serving traffic would likely cause user-facing requests to time out.
> * It forces an extra commit with its consequent cache cycling, etc.
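
For illustration, the listener-based purge that the issue asks about might look roughly like the sketch below. This is not the attached patch and it does not use Lucene or Solr classes: Reader, ClosedListener, and FloatCache are hypothetical stand-ins invented for this example, and the float[] allocation is only a placeholder for loading an external file. The point is that the entry is removed when the reader is closed instead of waiting for the GC to clear the weak key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class ReaderCacheSketch {

    /** Hypothetical callback, standing in for a reader-close listener. */
    interface ClosedListener {
        void onClose(Reader reader);
    }

    /** Hypothetical stand-in for an index reader; not Lucene's IndexReader. */
    static final class Reader {
        private final List<ClosedListener> listeners = new ArrayList<>();
        private boolean closed;

        void addClosedListener(ClosedListener listener) {
            listeners.add(listener);
        }

        void close() {
            if (closed) {
                return; // notify listeners only once
            }
            closed = true;
            for (ClosedListener listener : listeners) {
                listener.onClose(this);
            }
        }
    }

    /** Per-reader cache of float[] values, purged eagerly when a reader closes. */
    static final class FloatCache {
        // Weak keys remain as a safety net for readers that are never closed.
        private final Map<Reader, float[]> cache = new WeakHashMap<>();

        synchronized float[] get(Reader reader, int size) {
            float[] values = cache.get(reader);
            if (values == null) {
                values = new float[size]; // placeholder for loading the external file
                cache.put(reader, values);
                // Register the purge when the entry is first created.
                reader.addClosedListener(this::purge);
            }
            return values;
        }

        synchronized void purge(Reader reader) {
            cache.remove(reader);
        }

        synchronized int size() {
            return cache.size();
        }
    }

    public static void main(String[] args) {
        FloatCache cache = new FloatCache();
        Reader reader = new Reader();
        cache.get(reader, 4);
        System.out.println("entries before close: " + cache.size());
        reader.close();
        System.out.println("entries after close: " + cache.size());
    }
}
```

Only the entry for the closed reader is dropped here, so entries for readers that are still open stay warm, which is the behavior the issue says clearing the whole cache cannot provide.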