accumulo-user mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: Standby GC OutOfMemoryError
Date Mon, 16 Dec 2013 16:22:27 GMT
I didn't leave it running.

Since you have been able to reproduce the problem, could you dump the memory
after a few days, and see if there's any obvious reason why it would run
out of memory?  It just doesn't make any sense.
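
For reference, a heap dump of a running JVM is usually taken from outside
with "jmap -dump:format=b,file=<file> <pid>". The same HotSpot facility can
also be reached from Java. The following is only a minimal sketch, assuming
a HotSpot JVM: the class name HeapDumper and the default file name are made
up for illustration, and it dumps the JVM it runs in, so to capture the
standby GC it would have to run inside that process or go through a remote
JMX connection (not shown).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: writes an .hprof heap dump of the current JVM. */
public class HeapDumper {
    // Standard ObjectName of the HotSpot diagnostic MXBean.
    private static final String HOTSPOT_BEAN =
        "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Dumps the heap of this JVM to the given path.
     * liveOnly = true limits the dump to reachable objects.
     */
    public static File dumpHeap(String path, boolean liveOnly) throws IOException {
        File out = new File(path);
        // dumpHeap refuses to overwrite, so clear any stale file first.
        if (out.exists() && !out.delete()) {
            throw new IOException("Cannot remove existing file " + out);
        }
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
            ManagementFactory.getPlatformMBeanServer(),
            HOTSPOT_BEAN,
            HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(out.getAbsolutePath(), liveOnly);
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default name; newer JDKs require the .hprof suffix.
        String path = args.length > 0 ? args[0] : "standby-gc.hprof";
        File dump = dumpHeap(path, true);
        System.out.println("Wrote " + dump.length() + " bytes to "
            + dump.getAbsolutePath());
    }
}
```

The resulting .hprof file can be opened in a heap analyzer to see which
classes hold the most memory after a few days in standby.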

-Eric


On Fri, Dec 13, 2013 at 4:17 PM, Terry P. <texpilot@gmail.com> wrote:

> Hi Eric,
> Did your standby GC eventually fail on you with an OOME?
>
> I was able to reproduce the standby GC failure after about 6 days running
> in standby mode again on my Ops cluster after a cluster restart following
> the Thanksgiving holiday week.
>
> Thanks,
> Terry
>
>
>
> On Wed, Dec 4, 2013 at 4:33 PM, Terry P. <texpilot@gmail.com> wrote:
>
>> Thanks for testing, Eric. I have nproc set hard to 32000 on all my nodes,
>> and just double checked it's correct on the Secondary Namenode where this
>> is happening. nofile is set hard to 64000 so it's not a files/sockets
>> issue either.
>>
>> Please let me know if it eventually fails for you.
>>
>>
>>
>> On Wed, Dec 4, 2013 at 3:41 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>
>>> I fired up a standby GC after I reduced the wait time between zookeeper
>>> lock checks to 10ms, and changed the memory from 256M to 56M.  It's been
>>> running for the last hour.  I didn't expect it to fail, but I wanted to
>>> make sure it wasn't reproducible.
>>>
>>> It's possible that you are running up against an nproc limit.   We've
>>> seen out of memory issues when the JVM can't create a new thread.
>>>
>>> (I ran the test with 1.4.5-SNAPSHOT)
>>>
>>> -Eric
>>>
>>>
>>> On Wed, Dec 4, 2013 at 2:31 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>
>>>> I don't know if anyone is running a standby GC.  Can you go ahead and
>>>> open a ticket?
>>>>
>>>> -Eric
>>>>
>>>>
>>>>
>>>> On Wed, Dec 4, 2013 at 2:29 PM, Terry P. <texpilot@gmail.com> wrote:
>>>>
>>>>> Greetings folks,
>>>>> With Accumulo 1.4.2 I'm running Standby Master and GC processes on my
>>>>> Secondary Namenode. I've found that my Standby GC gets terminated due to
>>>>> OutOfMemoryError errors after about 6 days of running, even though it is
>>>>> running in standby mode only. The Standby Master is still running fine
>>>>> after 3 weeks of standby mode.
>>>>>
>>>>> My accumulo-env.sh script is using the 3GB environment default GC
>>>>> memory options of -Xmx256m -Xms256m, same as on the Master where the
>>>>> primary GC runs and has never gotten an OOME. At this point I don't see any
>>>>> reason to try increasing that as my bet is it will only delay the OOME.
>>>>>
>>>>> Anyone else running standby GC and running into this (or working fine)?
>>>>>
>>>>
>>>>
>>>
>>
>
