lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Possible bug in buffered deletes on trunk?
Date Mon, 25 Jul 2011 10:24:13 GMT
OK I tracked this down... indeed we have a bug in trunk (and not 3.x)
whereby buffered deletes are never flushed by RAM or count.

I have unit tests that show the problem... I'll open an issue.
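
The behaviour described in this thread can be illustrated with a small, self-contained Java sketch. It is a toy model only, not Lucene code: the class BufferedDeletesSketch, its methods, and the 64-byte per-term cost are hypothetical names and numbers chosen for illustration. It assumes that each update buffers one delete term keyed by the document id, and that the buffer is cleared either by a count or RAM trigger or by a commit. With both triggers disabled (a stand-in for the bug), unique ids make the buffer grow with every document, while repeated ids keep it small.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these are NOT Lucene's real classes or API.
class BufferedDeletesSketch {
    // Hypothetical per-term memory cost, chosen only for illustration.
    static final long BYTES_PER_TERM = 64;

    private final Map<String, Integer> bufferedTerms = new HashMap<>();
    private final int maxBufferedTerms; // <= 0 disables the count trigger
    private final long maxRamBytes;     // <= 0 disables the RAM trigger
    private int peakTerms = 0;
    private int docCount = 0;

    BufferedDeletesSketch(int maxBufferedTerms, long maxRamBytes) {
        this.maxBufferedTerms = maxBufferedTerms;
        this.maxRamBytes = maxRamBytes;
    }

    // Models an update: buffer a delete for the id term, then count the doc.
    void updateDocument(String id) {
        bufferedTerms.put(id, docCount++); // a repeated id reuses its entry
        peakTerms = Math.max(peakTerms, bufferedTerms.size());
        if (shouldFlush()) {
            bufferedTerms.clear(); // "apply" the buffered deletes
        }
    }

    // True when either the count trigger or the RAM trigger fires.
    boolean shouldFlush() {
        int size = bufferedTerms.size();
        boolean byCount = maxBufferedTerms > 0 && size >= maxBufferedTerms;
        boolean byRam = maxRamBytes > 0 && size * BYTES_PER_TERM >= maxRamBytes;
        return byCount || byRam;
    }

    // A commit always applies whatever is buffered.
    void commit() {
        bufferedTerms.clear();
    }

    int bufferedTermCount() {
        return bufferedTerms.size();
    }

    int peakTermCount() {
        return peakTerms;
    }

    // Indexes numDocs documents; each id is reused `repeat` times in a row.
    static BufferedDeletesSketch run(int numDocs, int repeat,
                                     int maxTerms, long maxRam) {
        BufferedDeletesSketch s = new BufferedDeletesSketch(maxTerms, maxRam);
        for (int i = 0; i < numDocs; i++) {
            s.updateDocument("id" + (i / repeat));
        }
        return s;
    }
}
```

In this model, run(100000, 1, 0, 0) leaves all 100000 terms buffered until commit() is called, whereas run(100000, 1000, 0, 0) never holds more than 100 terms, which is consistent with the doc maker's repeated ids hiding the problem.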

Mike McCandless

http://blog.mikemccandless.com

On Sun, Jul 24, 2011 at 6:36 AM, Mike McCandless
<lucene@mikemccandless.com> wrote:
> Not good!  I'll dig once I'm back from vacation... sounds like something is up.
>
> Mike
>
> On Jul 23, 2011, at 4:24 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>> So eventually of course, after spending a few years in GC hell, you do hit the OOM.
>>
>> On Jul 23, 2011, at 10:33 AM, Mark Miller wrote:
>>
>>> Last night Alexey Serba pointed out an issue he was seeing. He said that
>>> when he used an older version of Solr to index millions of docs, the memory
>>> usage stayed quite low, but with a recent trunk version the memory usage
>>> skyrocketed. No OOM that I have heard of or seen yet, but rather than cycling
>>> between 50 and a couple hundred megabytes of RAM, the usage jumps up to what
>>> is available. It doesn't drop back down until you do a commit.
>>>
>>> Interested, I started indexing millions of docs with my benchmark work, and I
>>> didn't see the problem. Based on some early profiling by Alexey, it looked
>>> like buffered deletes were involved (by default, Solr always uses update to
>>> maintain unique ids). I indexed about 13 million docs, and RAM usage looked
>>> nice. After a bit of digging though, I saw that the doc maker was not
>>> assigning ids sequentially for some reason: it was assigning the same id a
>>> bunch of times in a row before incrementing it. Odd, so I fixed this to
>>> increment on every document. And now I see the problem right away. Memory
>>> consumption just goes up, up, up and tops out near the max available.
>>>
>>> Still investigating. I have not tried with pure Lucene yet, but it looks like
>>> a pure Lucene issue to me so far. I see that in late June Mike fixed something
>>> related to buffered deletes; perhaps there is still something off in how RAM
>>> usage is tracked for deletes?
>>>
>>> - Mark Miller
>>> lucidimagination.com
>>
>> - Mark Miller
>> lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>


