Subject: Re: Memory leak?? with CloseableThreadLocal with use of Snowball Filter
From: roz dev
To: java-user@lucene.apache.org, simon.willnauer@gmail.com
Date: Wed, 1 Aug 2012 23:32:29 -0700

Wow!! That was quick. Thanks a ton.

On Wed, Aug 1, 2012 at 11:07 PM, Simon Willnauer wrote:

> On Thu, Aug 2, 2012 at 7:53 AM, roz dev wrote:
> > Thanks Robert for these inputs.
> >
> > Since we do not really need the Snowball analyzer for this field, we will
> > not use it for now. If this still does not address our issue, we will tweak
> > the thread pool as per eks dev's suggestion. I am a bit hesitant to make
> > this change yet, as we would be reducing the thread pool, which can
> > adversely impact our throughput.
> >
> > If the Snowball Filter is being optimized for Solr 4 beta then it would be
> > great for us. If you have already filed a JIRA for this then please let me
> > know; I would like to follow it.
>
> AFAIK Robert already created an issue here:
> https://issues.apache.org/jira/browse/LUCENE-4279
> and it seems fixed. Given the massive commit last night it's already
> committed and backported, so it will be in 4.0-BETA.
>
> simon
> >
> > Thanks again
> > Saroj
> >
> > On Wed, Aug 1, 2012 at 8:37 AM, Robert Muir wrote:
> >
> >> On Tue, Jul 31, 2012 at 2:34 PM, roz dev wrote:
> >> > Hi All
> >> >
> >> > I am using Solr 4 from trunk and using it with Tomcat 6. I am noticing
> >> > that when we are indexing lots of data with 16 concurrent threads, heap
> >> > grows continuously.
> >> > It remains high and ultimately most of the stuff ends up being moved to
> >> > Old Gen. Eventually, Old Gen also fills up and we start getting into
> >> > excessive GC problems.
> >>
> >> Hi: I don't claim to know anything about how Tomcat manages threads,
> >> but really you shouldn't have all these objects.
> >>
> >> In general, Snowball stemmers should be reused per-thread-per-field.
> >> But if you have a lot of fields*threads, especially if there really is
> >> high thread churn on Tomcat, then this could be bad with Snowball:
> >> see eks dev's comment on https://issues.apache.org/jira/browse/LUCENE-3841
> >>
> >> I think it would be useful to see if you can tune Tomcat's thread pool
> >> as he describes.
> >>
> >> Separately: Snowball stemmers are currently really RAM-expensive for
> >> stupid reasons.
> >> Each one creates a ton of Among objects, e.g. an EnglishStemmer today
> >> is about 8KB.
> >>
> >> I'll regenerate these and open a JIRA issue, as the Snowball code
> >> generator in their svn was improved
> >> recently and each one now takes about 64 bytes instead (the Amongs
> >> are static and reused).
> >>
> >> Still, this won't really "solve your problem", because the analysis
> >> chain could have other heavy parts
> >> in initialization, but it seems good to fix.
> >>
> >> As a workaround until then you can also just use the "good old
> >> PorterStemmer" (PorterStemFilterFactory in Solr).
> >> It's not exactly the same as using Snowball(English) but it's pretty
> >> close and also much faster.
> >>
> >> --
> >> lucidimagination.com
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
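
To make the "reused per-thread-per-field" point in the quoted thread concrete, here is a minimal sketch in plain JDK Java. It is not Lucene's actual implementation: HeavyStemmer, stemmerFor, estimateBytes, and the three field names are hypothetical stand-ins, and the only figures taken from the thread are the 16 indexing threads and the rough per-stemmer sizes (about 8KB before the regeneration, about 64 bytes after).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class StemmerReuseSketch {

    /** Hypothetical stand-in for a stemmer that is costly to build; not a Lucene class. */
    static final class HeavyStemmer {
        static final AtomicInteger CREATED = new AtomicInteger();

        HeavyStemmer() {
            CREATED.incrementAndGet(); // count constructions so reuse is observable
        }

        String stem(String word) {
            // Toy rule only: strip one trailing "s". Real stemmers do far more.
            return word.length() > 1 && word.endsWith("s")
                    ? word.substring(0, word.length() - 1)
                    : word;
        }
    }

    // One map per thread, keyed by field name: the per-thread-per-field cache.
    private static final ThreadLocal<Map<String, HeavyStemmer>> PER_THREAD =
            ThreadLocal.withInitial(HashMap::new);

    /** Returns this thread's stemmer for the field, building it on first use. */
    static HeavyStemmer stemmerFor(String field) {
        return PER_THREAD.get().computeIfAbsent(field, f -> new HeavyStemmer());
    }

    /** Rough footprint of the cached stemmers: threads * fields * bytes each. */
    static long estimateBytes(int threads, int fields, long bytesPerStemmer) {
        return (long) threads * fields * bytesPerStemmer;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 16;                            // from the thread: 16 indexing threads
        String[] fields = {"title", "body", "tags"}; // assumed field names
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int doc = 0; doc < 1000; doc++) {
                    for (String field : fields) {
                        stemmerFor(field).stem("stemmers");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // Each thread builds one stemmer per field, however many documents it handles.
        System.out.println("stemmers created: " + HeavyStemmer.CREATED.get());
        System.out.println("at ~8KB each: "
                + estimateBytes(threads, fields.length, 8 * 1024) + " bytes");
        System.out.println("at ~64 bytes each: "
                + estimateBytes(threads, fields.length, 64) + " bytes");
    }
}
```

The number of cached stemmers scales with threads times fields, so a long-lived, bounded thread pool keeps it fixed. If the container keeps replacing worker threads, every fresh thread pays the construction cost again, which is presumably why tuning the thread pool and shrinking the per-stemmer footprint are both suggested above.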