lucene-java-user mailing list archives

From roz dev <rozde...@gmail.com>
Subject Re: Memory leak?? with CloseableThreadLocal with use of Snowball Filter
Date Thu, 02 Aug 2012 06:32:29 GMT
Wow!! That was quick.

Thanks a ton.


On Wed, Aug 1, 2012 at 11:07 PM, Simon Willnauer
<simon.willnauer@gmail.com> wrote:

> On Thu, Aug 2, 2012 at 7:53 AM, roz dev <rozdev29@gmail.com> wrote:
> > Thanks Robert for these inputs.
> >
> > Since we do not really need the Snowball analyzer for this field, we
> > will not use it for now. If this still does not address our issue, we
> > will tweak the thread pool as per eks dev's suggestion. I am a bit
> > hesitant to make this change yet, as we would be reducing the thread
> > pool, which could adversely impact our throughput.
> >
> > If the Snowball Filter is being optimized for Solr 4 beta, that would be
> > great for us. If you have already filed a JIRA issue for this, please
> > let me know; I would like to follow it.
>
> AFAIK Robert already created an issue here:
> https://issues.apache.org/jira/browse/LUCENE-4279
> and it seems fixed. Given the massive commit last night, it's already
> committed and backported, so it will be in 4.0-BETA.
>
> simon
> >
> > Thanks again
> > Saroj
> >
> >
> >
> >
> >
> > On Wed, Aug 1, 2012 at 8:37 AM, Robert Muir <rcmuir@gmail.com> wrote:
> >
> >> On Tue, Jul 31, 2012 at 2:34 PM, roz dev <rozdev29@gmail.com> wrote:
> >> > Hi All
> >> >
> >> > I am using Solr 4 from trunk with Tomcat 6. I am noticing that when
> >> > we are indexing lots of data with 16 concurrent threads, the heap
> >> > grows continuously. It remains high, and ultimately most of the stuff
> >> > ends up being moved to Old Gen. Eventually, Old Gen also fills up and
> >> > we start running into excessive GC problems.
> >>
> >> Hi: I don't claim to know anything about how Tomcat manages threads,
> >> but really you shouldn't have all these objects.
> >>
> >> In general snowball stemmers should be reused per-thread-per-field.
> >> But if you have a lot of fields*threads, especially if there really is
> >> high thread churn on tomcat, then this could be bad with snowball:
> >> see eks dev's comment on
> >> https://issues.apache.org/jira/browse/LUCENE-3841
> >>
> >> I think it would be useful to see if you can tune tomcat's threadpool
> >> as he describes.
> >>
> >> Separately: Snowball stemmers are currently really RAM-expensive for
> >> stupid reasons.
> >> Each one creates a ton of Among objects, e.g. an EnglishStemmer today
> >> is about 8KB.
> >>
> >> I'll regenerate these and open a JIRA issue: the Snowball code
> >> generator in their svn was improved recently, and each one now takes
> >> about 64 bytes instead (the Amongs are static and reused).
> >>
> >> Still, this won't really "solve your problem", because the analysis
> >> chain could have other heavy parts
> >> in initialization, but it seems good to fix.
> >>
> >> As a workaround until then, you can also just use the "good old
> >> PorterStemmer" (PorterStemFilterFactory in Solr).
> >> It's not exactly the same as using Snowball(English), but it's pretty
> >> close and also much faster.
> >>
> >> --
> >> lucidimagination.com
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
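
The per-thread-per-field reuse described in the quoted thread, and why the
number of live stemmers scales with fields*threads, can be sketched roughly as
below. This is a plain-Java illustration, not Lucene's actual code:
PerThreadStemmerCache, FakeStemmer, stemmerFor and estimateBytes are
hypothetical names, and the count of 10 fields is an assumption. The 16
threads and the 8KB and 64 byte per-stemmer figures come from the messages
above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadStemmerCache {

    /** Hypothetical stand-in for a stemmer; NOT a Lucene or Snowball class. */
    public static final class FakeStemmer {
        public static final AtomicInteger CREATED = new AtomicInteger();

        FakeStemmer() {
            CREATED.incrementAndGet(); // count every instance ever built
        }
    }

    // Each thread gets its own map of field name -> stemmer.
    private static final ThreadLocal<Map<String, FakeStemmer>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    /** Returns the calling thread's stemmer for a field, creating it once. */
    public static FakeStemmer stemmerFor(String field) {
        return CACHE.get().computeIfAbsent(field, f -> new FakeStemmer());
    }

    /** Rough footprint: one stemmer per (thread, field) pair. */
    public static long estimateBytes(int threads, int fields, long bytesPerStemmer) {
        return (long) threads * fields * bytesPerStemmer;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 16; // from the original report
        int fields = 10;  // assumption, for illustration only

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                // Several passes over the same fields reuse the cached instances.
                for (int pass = 0; pass < 3; pass++) {
                    for (int f = 0; f < fields; f++) {
                        stemmerFor("field" + f);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }

        System.out.println("stemmers created: " + FakeStemmer.CREATED.get());
        System.out.println("approx bytes at 8KB each: "
                + estimateBytes(threads, fields, 8 * 1024));
        System.out.println("approx bytes at 64 bytes each: "
                + estimateBytes(threads, fields, 64));
    }
}
```

Repeated passes on one thread reuse its instances, but every new thread builds
its own set, which is why high thread churn multiplies the allocation cost.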
