lucene-java-user mailing list archives

From Dawid Weiss <dawid.we...@gmail.com>
Subject Re: Memory leak?? with CloseableThreadLocal with use of Snowball Filter
Date Thu, 02 Aug 2012 06:34:29 GMT
http://static1.blip.pl/user_generated/update_pictures/1758685.jpg

On Thu, Aug 2, 2012 at 8:32 AM, roz dev <rozdev29@gmail.com> wrote:
> wow!! That was quick.
>
> Thanks a ton.
>
>
> On Wed, Aug 1, 2012 at 11:07 PM, Simon Willnauer <simon.willnauer@gmail.com> wrote:
>
>> On Thu, Aug 2, 2012 at 7:53 AM, roz dev <rozdev29@gmail.com> wrote:
>> > Thanks Robert for these inputs.
>> >
>> > Since we do not really need the Snowball analyzer for this field, we will
>> > not use it for now. If this still does not address our issue, we will tweak
>> > the thread pool as per eks dev's suggestion. I am a bit hesitant to make
>> > this change yet, as we would be reducing the thread pool, which can
>> > adversely impact our throughput.
>> >
>> > If the Snowball filter is being optimized for Solr 4 beta, that would be
>> > great for us. If you have already filed a JIRA for this, please let me
>> > know; I would like to follow it.
>>
>> AFAIK Robert already created an issue here:
>> https://issues.apache.org/jira/browse/LUCENE-4279
>> and it seems to be fixed. Given the massive commit last night, it's already
>> committed and backported, so it will be in 4.0-BETA.
>>
>> simon
>> >
>> > Thanks again
>> > Saroj
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Aug 1, 2012 at 8:37 AM, Robert Muir <rcmuir@gmail.com> wrote:
>> >
>> >> On Tue, Jul 31, 2012 at 2:34 PM, roz dev <rozdev29@gmail.com> wrote:
>> >> > Hi All
>> >> >
>> >> > I am using Solr 4 from trunk with Tomcat 6. I am noticing that when
>> >> > we index lots of data with 16 concurrent threads, the heap grows
>> >> > continuously. It remains high, and ultimately most of the objects end up
>> >> > being moved to Old Gen. Eventually, Old Gen also fills up and we start
>> >> > running into excessive GC problems.
>> >>
>> >> Hi: I don't claim to know anything about how Tomcat manages threads,
>> >> but really you shouldn't have all these objects.
>> >>
>> >> In general, Snowball stemmers should be reused per-thread-per-field.
>> >> But if you have a lot of fields*threads, especially if there really is
>> >> high thread churn on Tomcat, then this could be bad with Snowball:
>> >> see eks dev's comment on
>> >> https://issues.apache.org/jira/browse/LUCENE-3841
>> >>
>> >> I think it would be useful to see if you can tune tomcat's threadpool
>> >> as he describes.
>> >>
>> >> Separately: Snowball stemmers are currently really RAM-expensive for
>> >> stupid reasons.
>> >> Each one creates a ton of Among objects, e.g. an EnglishStemmer today
>> >> is about 8KB.
>> >>
>> >> I'll regenerate these and open a JIRA issue: the Snowball code
>> >> generator in their svn was improved
>> >> recently, and each one now takes about 64 bytes instead (the Amongs
>> >> are static and reused).
>> >>
>> >> Still, this won't really "solve your problem", because the analysis
>> >> chain could have other heavy parts
>> >> in initialization, but it seems good to fix.
>> >>
>> >> As a workaround until then, you can also just use the "good old
>> >> PorterStemmer" (PorterStemFilterFactory in Solr).
>> >> It's not exactly the same as using Snowball(English), but it's pretty
>> >> close and also much faster.
>> >>
>> >> --
>> >> lucidimagination.com
>> >>
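
A rough sketch of the point Robert makes about the Among objects, for readers who want to see the difference in code. This is not the Snowball-generated stemmer: `Rule`, `PerInstanceStemmer`, `SharedTableStemmer` and the three suffix rules are hypothetical stand-ins chosen for illustration. It only shows the general idea that an immutable lookup table held in an instance field is rebuilt for every stemmer, while the same table in a static field is built once and shared.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AmongSharingSketch {
    // Counts allocated Rule objects, used here as a crude stand-in for heap cost.
    static final AtomicInteger RULES_ALLOCATED = new AtomicInteger();

    // Hypothetical stand-in for an "Among" entry: an immutable suffix rule.
    static final class Rule {
        final String suffix;
        final String replacement;

        Rule(String suffix, String replacement) {
            this.suffix = suffix;
            this.replacement = replacement;
            RULES_ALLOCATED.incrementAndGet();
        }
    }

    // Builds a tiny illustrative rule table (a real stemmer has far more entries).
    static Rule[] buildRules() {
        return new Rule[] {
            new Rule("sses", "ss"),
            new Rule("ies", "i"),
            new Rule("s", "")
        };
    }

    // Applies the first rule whose suffix matches the word.
    static String apply(Rule[] rules, String word) {
        for (Rule r : rules) {
            if (word.endsWith(r.suffix)) {
                return word.substring(0, word.length() - r.suffix.length()) + r.replacement;
            }
        }
        return word;
    }

    // Every instance carries its own copy of the table.
    static final class PerInstanceStemmer {
        private final Rule[] rules = buildRules();

        String stem(String word) {
            return apply(rules, word);
        }
    }

    // The table is static: built once when the class loads, shared by all instances.
    static final class SharedTableStemmer {
        private static final Rule[] RULES = buildRules();

        String stem(String word) {
            return apply(RULES, word);
        }
    }

    // Returns how many Rule objects were allocated while running the given work.
    static int rulesAllocatedBy(Runnable work) {
        int before = RULES_ALLOCATED.get();
        work.run();
        return RULES_ALLOCATED.get() - before;
    }

    public static void main(String[] args) {
        int perInstance = rulesAllocatedBy(() -> {
            for (int i = 0; i < 100; i++) new PerInstanceStemmer();
        });
        int shared = rulesAllocatedBy(() -> {
            for (int i = 0; i < 100; i++) new SharedTableStemmer();
        });
        System.out.println("rules allocated, per-instance tables: " + perInstance);
        System.out.println("rules allocated, shared static table: " + shared);
    }
}
```

Both variants stem identically; only the number of table objects created per stemmer differs, which is the kind of saving the regenerated code is described as giving.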
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
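
A second sketch, this time for the "per-thread-per-field" reuse and the thread-churn concern raised in the thread. It uses the plain JDK `ThreadLocal`, not Lucene's `CloseableThreadLocal`, and `FakeStemmer`, `stemmerFor` and `stemmersCreated` are made-up names; it is an assumption-laden illustration, not how Lucene or Solr actually caches analysis components. It shows that with per-thread caching the number of stemmers created is roughly threads × fields, so a few long-lived threads create few stemmers while many short-lived threads create many for the same number of documents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadReuseSketch {
    // Counts how many stemmers were constructed across all threads.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Hypothetical stand-in for an expensive-to-build stemmer.
    static final class FakeStemmer {
        FakeStemmer() {
            CREATED.incrementAndGet();
        }
    }

    // Each thread gets its own map of field name -> stemmer.
    static final ThreadLocal<Map<String, FakeStemmer>> PER_THREAD =
        ThreadLocal.withInitial(HashMap::new);

    // Reuses the calling thread's stemmer for this field, creating it on first use.
    static FakeStemmer stemmerFor(String field) {
        return PER_THREAD.get().computeIfAbsent(field, f -> new FakeStemmer());
    }

    // "Indexes" one document by touching the stemmer of every field.
    static void indexOneDocument(String[] fields) {
        for (String field : fields) {
            stemmerFor(field);
        }
    }

    // Starts `threads` fresh threads, each indexing `docsPerThread` documents,
    // and returns how many stemmers were created in total.
    static int stemmersCreated(int threads, int docsPerThread, String[] fields) {
        int before = CREATED.get();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int d = 0; d < docsPerThread; d++) {
                    indexOneDocument(fields);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return CREATED.get() - before;
    }

    public static void main(String[] args) {
        String[] fields = {"title", "body", "tags"};
        // Same 100 documents in both runs; only the thread lifetime differs.
        System.out.println("4 long-lived threads:    " + stemmersCreated(4, 25, fields));
        System.out.println("100 short-lived threads: " + stemmersCreated(100, 1, fields));
    }
}
```

In this toy setup the cached objects become unreachable once their thread ends; in a container with a large pool of long-lived threads they would stay reachable for as long as the threads do, which is why the thread pool size comes up as a tuning knob above.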
