zookeeper-user mailing list archives

From Camille Fournier <cami...@apache.org>
Subject Re: Consistently running out of heap space
Date Fri, 05 Sep 2014 15:31:43 GMT
You shouldn't use ZK to keep that data around. It's not designed to store a
ton of historical information. Thousands of jobs is no big deal, but
thousands of jobs and their history back through time is not what the
system is designed for.

C
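[Editor's note: a minimal sketch of the pattern Camille describes — keep the live job znode small, archive finished-job data outside ZooKeeper, then delete the znode. The client object is assumed to be kazoo-style (`get(path)` returning `(bytes, stat)`, and `delete(path)`); the `/jobs/...` paths and the archive file are hypothetical names, not from this thread.]

```python
# Sketch: move a finished job's history out of ZooKeeper's heap and onto disk.
# Assumes a kazoo-style client; paths and archive format are illustrative only.
import json
import time

def archive_job(zk, job_path, archive_file):
    """Append the finished job's data to a local archive, then delete the znode."""
    data, _stat = zk.get(job_path)          # kazoo-style: returns (bytes, ZnodeStat)
    record = {
        "path": job_path,
        "archived_at": time.time(),
        "data": data.decode("utf-8"),
    }
    archive_file.write(json.dumps(record) + "\n")
    zk.delete(job_path)                     # history now lives on disk, not in ZK memory
    return record
```

A web monitoring tool would then read completed-job history from the archive file (or a real database) instead of querying ZooKeeper.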


On Fri, Sep 5, 2014 at 11:29 AM, Brian C. Huffman <
bhuffman@etinternational.com> wrote:

> We use zookeeper to keep track of the jobs we run, and we run thousands of
> jobs. When a job is finished it is no longer needed except for web
> monitoring tools. Is that considered state? We want to keep that around so
> we have a history of completed jobs. Will these stay in memory?
>
> Thanks,
> Brian
>
>
> On 09/05/2014 11:26 AM, Camille Fournier wrote:
>
>> All state is stored in memory in ZK for performance reasons. It sounds
>> like
>> you're putting more data into it than the heap will accommodate.
>> ZK is useful for references to data, but not for large amounts of actual
>> data. It's not designed to be a large data store.
>>
>> Thanks,
>> C
>>
>>
>> On Fri, Sep 5, 2014 at 10:33 AM, Brian C. Huffman <
>> bhuffman@etinternational.com> wrote:
>>
>>  Flavio,
>>>
>>> I was having the same problems on 3.4.5, so I upgraded to 3.4.6. It
>>> doesn't seem to be related to the version.
>>>
>>> You might be right about the storing of state.  I'm curious - does the
>>> "state" consist of the entire node listing? Is there any way to tell
>>> ZooKeeper to keep a node around but only on disk?
>>>
>>> Thanks,
>>> Brian
>>>
>>>
>>> On 09/05/2014 09:47 AM, Flavio Junqueira wrote:
>>>
>>>  Brian,
>>>>
>>>> How much state are you storing in ZK? Can you check the size of the
>>>> snapshots?
>>>>
>>>> One common problem when folks are testing is that they forget to delete
>>>> the data from previous tests, so the state keeps accumulating and the
>>>> server keeps crashing because the state is too large.
>>>>
>>>> Also, consider trying 3.4.5 just to see if it is a problem with 3.4.6
>>>> alone.
>>>>
>>>> -Flavio
>>>>
>>>>
>>>> On Friday, September 5, 2014 2:23 PM, Brian C. Huffman <
>>>> bhuffman@etinternational.com> wrote:
>>>>
>>>>
>>>>  We're running the latest version of the stable 3.4 branch (3.4.6) and
>>>>> have been consistently having problems running out of heap space.
>>>>>
>>>>> We're running a single server (redundancy isn't a concern at this
>>>>> point)
>>>>> and I've tried the defaults (which seem to use Java's default heap of
>>>>> 8GB) as well as limiting it to 3GB.  Either way, the ZooKeeper server
>>>>> eventually dies.  With a larger heap size it seems to take longer to die.
>>>>>
>>>>> Here's the latest trace:
>>>>> 2014-09-05 00:51:11,419 [myid:] - ERROR [SyncThread:0:SyncRequestProcessor@183] - Severe unrecoverable error, exiting
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>       at java.util.Arrays.copyOf(Arrays.java:2271)
>>>>>       at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>>>>>       at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>>>>>       at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>>>>>       at java.io.DataOutputStream.write(DataOutputStream.java:107)
>>>>>       at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
>>>>>       at org.apache.jute.BinaryOutputArchive.writeBuffer(BinaryOutputArchive.java:119)
>>>>>       at org.apache.zookeeper.txn.Txn.serialize(Txn.java:49)
>>>>>       at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
>>>>>       at org.apache.zookeeper.txn.MultiTxn.serialize(MultiTxn.java:44)
>>>>>       at org.apache.zookeeper.server.persistence.Util.marshallTxnEntry(Util.java:263)
>>>>>       at org.apache.zookeeper.server.persistence.FileTxnLog.append(FileTxnLog.java:216)
>>>>>       at org.apache.zookeeper.server.persistence.FileTxnSnapLog.append(FileTxnSnapLog.java:314)
>>>>>       at org.apache.zookeeper.server.ZKDatabase.append(ZKDatabase.java:476)
>>>>>       at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:140)
>>>>> 2014-09-05 00:51:07,866 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
>>>>> EndOfStreamException: Unable to read additional data from client sessionid 0x14837ac98960071, likely client has closed socket
>>>>>       at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
>>>>>       at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>>>>>       at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> Here's my configuration:
>>>>> [user@xyz conf]$ grep -v '^#' zoo.cfg
>>>>> tickTime=2000
>>>>> initLimit=10
>>>>> syncLimit=5
>>>>> dataDir=/usr/local/var/zookeeper
>>>>> clientPort=2181
>>>>> autopurge.snapRetainCount=3
>>>>> autopurge.purgeInterval=1
>>>>>
>>>>> Can anyone suggest what the issue could be?
>>>>>
>>>>> Thanks,
>>>>> Brian
>>>>>
>>>
>
>
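[Editor's note: Flavio's suggestion in the quoted thread — check the size of the snapshots — can be sketched as a small script. The `dataDir` default below comes from Brian's zoo.cfg; the `version-2` subdirectory and `snapshot.*` filenames are ZooKeeper 3.4's standard on-disk layout. Since all state is held in memory, an unexpectedly large snapshot is a good proxy for heap pressure.]

```python
# Report ZooKeeper snapshot sizes to gauge how much state lives in the heap.
# dataDir default matches the zoo.cfg posted in this thread.
import glob
import os

def snapshot_sizes(data_dir="/usr/local/var/zookeeper"):
    """Return {snapshot filename: size in bytes}, largest first."""
    paths = glob.glob(os.path.join(data_dir, "version-2", "snapshot.*"))
    sizes = {os.path.basename(p): os.path.getsize(p) for p in paths}
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))

if __name__ == "__main__":
    for name, size in snapshot_sizes().items():
        print(f"{name}\t{size / 1024 / 1024:.1f} MB")
```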
