zookeeper-user mailing list archives

From Flavio Junqueira <fpjunque...@yahoo.com.INVALID>
Subject Re: What goes in the snapshot?
Date Mon, 16 Mar 2015 16:51:05 GMT
Hi Karol,
I'll have a look. We are a bit busy with the next release for the 3.5 branch (RC should go
out this week), so don't worry if we don't say anything about them in the next few days. If
we take too long, feel free to ping me on this list again.

On Monday, March 16, 2015 4:23 PM, "Dudzinski, Karol" <Karol.Dudzinski@gs.com> wrote:

 Hi Flavio,

I've created a JIRA for this: https://issues.apache.org/jira/browse/ZOOKEEPER-2141. I'll
shortly upload a patch to demonstrate the approach I was considering.

While I was at it, I submitted a few other JIRAs for some issues we've hit. I'm happy to
submit patches for all of them but would appreciate some comments from the committers about
the approaches, or even the validity of what I'm suggesting.

The other JIRAs are:



-----Original Message-----
From: Flavio Junqueira [mailto:fpjunqueira@yahoo.com.INVALID] 
Sent: 26 February 2015 22:53
To: user@zookeeper.apache.org
Cc: adam@milne-smith.co.uk
Subject: Re: What goes in the snapshot?

Hi Karol,

The use of reference counters might be a good way around it. To make it backward compatible,
I think we can optionally use the counters if the third map is present in the snapshot. Would
it work?

I also think it would be good to create a jira for this so that we can track this discussion
and propose patches.


> On 26 Feb 2015, at 13:13, Karol Dudzinski <karoldudzinski@gmail.com> wrote:
> Hi Flavio,
> We've done some more analysis using the snapshot formatter and a heap dump and have found
> the source of the snapshot bloat.
> What is taking the majority of the space is the longKeyMap from DataTree. In the heap dump,
> aclKeyMap has as many entries (which is to be expected given how the maps are used) and is
> taking an equally large amount of space, though at least aclKeyMap isn't serialised to the
> snapshot.
> We use a custom authentication provider, but because the AuthenticationProvider.matches
> method does not provide the path being operated on, we end up sticking the path in the ACL
> id. Some of our apps generate a lot of paths for one-time use, and consequently we end up
> with lots of unique ACLs.
> The two ACL maps in DataTree seem to be an optimisation so that repeated usage of an ACL
> does not result in the full list being stored multiple times. However, entries are never
> removed from these two maps, so if ACLs are unique the maps (and the snapshot) grow forever.
> We're quite keen on fixing this as it's causing us lots of issues, and we're happy to
> provide a patch, but we will need your opinion on the various options:
> - create a third map which would hold a reference count for each ACL, updated as needed
> when a node is created or deleted or its ACL is set. When the reference count reaches 0,
> remove the entry from all the maps
> - use weak references in some shape or form, though this is made harder by the fact that the
> ACL optimisation essentially needs a bidirectional index (hence the two maps). We've given
> this one lots of thought, but it would really require something like a ConcurrentWeakBiHashMap,
> which just sounds wrong and over-engineered :)
> The other fix that could be made is to pass the path being operated on to the
> AuthenticationProvider. However, doing that in a backwards-compatible fashion is not trivial,
> and even though it would fix my problem (by allowing me to remove the path from the ACL id)
> it wouldn't fix the general problem with this optimisation.
> Looking forward to hearing your thoughts on this.
> Thanks,
> Karol
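
The reference-count option in the message above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not ZooKeeper code and not the patch attached
to ZOOKEEPER-2141: the class RefCountedAclCache, its acquire/release/size methods and the
refCounts field are hypothetical names, and an ACL is modelled as a plain List<String> rather
than ZooKeeper's own ACL type. Only the names longKeyMap and aclKeyMap come from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a reference-counted ACL cache (not ZooKeeper code).
 * ACLs are modelled as List<String>; all method names are invented for illustration.
 */
public class RefCountedAclCache {
    // Plays the role of DataTree's longKeyMap: index -> ACL list.
    private final Map<Long, List<String>> longKeyMap = new HashMap<>();
    // Plays the role of DataTree's aclKeyMap: ACL list -> index.
    private final Map<List<String>, Long> aclKeyMap = new HashMap<>();
    // The proposed third map (assumed name): index -> number of nodes using that ACL.
    private final Map<Long, Integer> refCounts = new HashMap<>();
    private long nextIndex = 0;

    /** Call when a node starts using an ACL (node created, or ACL set on it). */
    public synchronized long acquire(List<String> acl) {
        Long index = aclKeyMap.get(acl);
        if (index == null) {
            // First use of this ACL: store one defensive copy in both maps.
            index = ++nextIndex;
            List<String> copy = new ArrayList<>(acl);
            aclKeyMap.put(copy, index);
            longKeyMap.put(index, copy);
        }
        refCounts.merge(index, 1, Integer::sum);
        return index;
    }

    /** Call when a node stops using an ACL (node deleted, or ACL replaced). */
    public synchronized void release(long index) {
        Integer count = refCounts.get(index);
        if (count == null) {
            return; // unknown index, nothing to release
        }
        if (count <= 1) {
            // Last user is gone: drop the entry from all three maps.
            refCounts.remove(index);
            List<String> acl = longKeyMap.remove(index);
            aclKeyMap.remove(acl);
        } else {
            refCounts.put(index, count - 1);
        }
    }

    /** Number of distinct ACLs currently held. */
    public synchronized int size() {
        return longKeyMap.size();
    }

    public static void main(String[] args) {
        RefCountedAclCache cache = new RefCountedAclCache();
        // Simulate the create/delete loop from the thread: each path gets a unique ACL id.
        for (int i = 0; i < 1000; i++) {
            List<String> acl = new ArrayList<>();
            acl.add("custom:/app/tmp-" + i + ":rw");
            long index = cache.acquire(acl);
            cache.release(index);
        }
        System.out.println("ACL entries left: " + cache.size());
    }
}
```

With a count kept per index, a create followed by a delete leaves no entry behind, which is
the behaviour the two existing maps lack. How such a map would be stored in, or rebuilt from,
existing snapshots is the backward-compatibility question raised in the reply and is not
addressed by this sketch.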
>> On 22 Feb 2015, at 14:55, Flavio Junqueira <fpjunqueira@yahoo.com.INVALID> wrote:
>> Hi Karol,
>> It's odd that you have such large snapshots and so little data in the data tree. Are
>> you creating lots of sessions? Right now I can't think of a good reason, so I really
>> suggest you use the snapshot formatter to inspect the snapshot.
>> -Flavio
>>> On 22 Feb 2015, at 14:23, Karol Dudzinski <karoldudzinski@gmail.com> wrote:
>>> Hi Flavio,
>>> Yes, one of our clients had a bug which caused it to go into a tight create/delete
>>> loop with zero net effect (i.e. it was deleting what it had just created). After stopping
>>> the client, the snapshot never reduced in size, so are the deletes in there permanently?
>>> Thanks,
>>> Karol
>>>> On 22 Feb 2015, at 14:05, Flavio Junqueira <fpjunqueira@yahoo.com.INVALID> wrote:
>>>> Hi there,
>>>> Perhaps a lot of data has been deleted? In any case, you may want to use
>>>> the SnapshotFormatter to check what is in the large snapshot.
>>>> -Flavio
>>>>> On 22 Feb 2015, at 10:44, Karol Dudzinski <karoldudzinski@gmail.com> wrote:
>>>>> Hi all,
>>>>> I was under the impression that the snapshot contained essentially an
>>>>> on-disk copy of all the data. However, one of our clusters has a snapshot which is over
>>>>> 1GB, while the mntr four-letter word reports an approximate data size in the hundreds of
>>>>> KB and a node count in the low thousands. So what else goes into the snapshot, and how
>>>>> can I slim it down?
>>>>> Thanks,
>>>>> Karol
